REPORT DOCUMENTATION PAGE

Report Number: AFIT/CI/NR 86-115D
Title (and Subtitle): A Communications Bandwidth Model for Shuffle-Exchange and Augmented Shuffle-Exchange Interprocessor Communication Networks
Type of Report: Dissertation
Author: Charles Robiou Bisbee III
Performing Organization Name and Address: AFIT Student at Auburn University
Controlling Office Name and Address: AFIT/NR, WPAFB OH 45433-6583
Security Class. (of this report): UNCLAS
Distribution Statement (of this Report): Approved for public release; distribution unlimited
Supplementary Notes: Approved for public release: IAW AFR 190-1. Dean for Research and Professional Development, AFIT/NR
Abstract: Attached.
DTIC stamp: AUG 28 1986

DISSERTATION  ABSTRACT 


A  COMMUNICATIONS  BANDWIDTH  MODEL  FOR  SHUFFLE-EXCHANGE 
AND  AUGMENTED  SHUFFLE-EXCHANGE  INTERPROCESSOR 
COMMUNICATION  NETWORKS 


Charles  Robiou  Bisbee  III 

Doctor  of  Philosophy,  June  6,  1986 
(M.  S.,  Stanford  University  1971) 

(B.  S.,  U.  S.  Air  Force  Academy  1970) 

128  Typed  Pages 

Directed  By  Victor  P.  Nelson 


A failure dependent bandwidth model for shuffle-exchange
(S/E) and augmented shuffle-exchange (S/E+)
interconnection networks is presented. The models are
based on probabilities of either data or address mode
failures for the individual binary switches which comprise
the S/E or S/E+ network. The model gives the expected
bandwidth as a function of the probability of failures in
these switches. The model, which is consistent with those
previously published when the probability of failure is
zero, is first developed for the S/E network. This model
is extended to the S/E+ network by developing a special
model for the input stage of the S/E+ network and then
proving that, to within a close approximation, the
conditions necessary for the S/E model hold at the outputs
of the first stage of the S/E+. The model is verified using a
computer simulation. An example is presented which
demonstrates use of the model to predict the effects of
several fault tolerance schemes on the bandwidth of these
networks. The model demonstrates that, when used as a
reliability enhancement, the extra stage of the S/E+ causes
a reduction in bandwidth as compared to an S/E network.
Thus the S/E+ network increases the probability that any
single processor-memory connection can be supported at the
expense of network throughput. The primary uses of the
bandwidth models presented here are as a network design
parameter and as a measure to evaluate the cost
effectiveness of proposed switch-level reliability
enhancements.
I. INTRODUCTION


Connecting  Networks 

The computational capacity of modern computers, while
immense by the standards of even a decade ago, fails to
meet the requirements of many currently relevant problems.
Until the recent past, advances in computational capacity
were gained by increasing the speed and decreasing the size
of the components from which computers are constructed.
The gains in computational capacity which can be expected
as a result of further advances in semiconductor technology,
while important, cannot alone meet the growing
requirements of modern problems.

The most promising approach for the development of
the next generation of computers lies in the development of
large parallel processing arrays. Such processors are
composed of a large number of individual processing elements
which communicate over an interconnection and communication
network (ICN). As illustrated in figure 1, the
ICN may be as simple as a single bus structure, in which a
communication path can only be established between a single
pair of elements at a time, or as complicated as a full
crossbar network which allows a communication path to be
established between any free pair of elements.

Figure 1 - Cross Bar and Bus Connected Systems
(panels: Bus Connected System, Cross Bar Connected System)

The single bus structure is simple and grows in
complexity linearly as the number of processing elements
attached to it increases. The communication capacity,
however, is not sufficient for meaningfully sized processor
arrays. The full crossbar, on the other hand, provides a
high communication capacity but grows as the square of the
number of connected elements. For large processor arrays
connected by a crossbar system, the cost and complexity of
the ICN dominate the system. Simpler ICNs, which provide
reasonable communication capacity and whose growth is
logarithmic with the number of processing elements, have been
proposed by several researchers. Among these are the
indirect binary n cube[24], the omega network[17], the
regular banyan[10], and the shuffle-exchange network[31],
some of which are illustrated in figure 2.

Previous  Work 

The study of connecting networks and their switching
capabilities found initial importance in the telephone
switching network. An excellent summary of this work was
written by Benes[3] in 1965. Stone[31] first proposed the
perfect shuffle interconnection pattern as a useful
permutation generator for use in parallel processing
applications. In 1975 Lawrie[17] proposed and analyzed the
capabilities of the omega network. This network was
designed to access and distribute the inputs and outputs of
N processing elements over N separate memories so as to
facilitate parallel vector computations. The omega network
was made up of log_2 N stages. Each stage consisted of N/2
binary (2 input - 2 output) crossbar switches. The stages
were interconnected using a perfect shuffle wiring pattern.
The omega network is topologically equivalent to, and its name
is now synonymous with, the shuffle-exchange network which
will be discussed in this paper. Lawrie's work was a major
contribution in that it not only described the network's
topology but also analyzed the network in terms of the
allowable permutations as applied to the particular problem
of data access in a single instruction multiple data (SIMD)
type machine. Subsequently a number of researchers have
proposed networks which are similar in structure and
topography but which appear to offer advantages for particular
applications. Notable among these were the indirect binary
n cube by Pease[24] and the delta by Patel[23]. Pease
proposed his network for application to multiple instruction
multiple data (MIMD) machines. Patel's paper, while
important at the time because it presented a new network, is
more important in that he developed a performance analysis
in terms of the bandwidth of his network for MIMD
applications. Pease defined bandwidth as the number of
simultaneously active connections the network could support. This
is the definition that will be used in this work.

Figure 2 - Examples of Binary Cross Bar ICNs
(panels: Shuffle Exchange, Augmented Shuffle Exchange)

In 1980 and 1981 Peng and Wu[37-38] and Parker[22]
proved that many of the previously proposed networks,
including the omega or shuffle-exchange, the binary n-cube,
the data manipulator, the flip network, the delta network,
the regular banyan and one form of the Clos network, were
topologically equivalent. This result is important in that,
given that this is true, performance analysis done for any
network in the class is generally applicable. In 1983
Bhuyan[5] generalized the theory of these networks by
analyzing a network composed of mixed radix crossbar
switches as opposed to the fixed radix, binary crossbars
generally used to form the networks prior to this. In 1983
Padmanaban[21] studied the addition of an extra stage to
these networks. The extra stage provided a reliability
enhancement by providing r paths, independent except for
the first and last stages, between any pair of ports, where
r is the radix of the crossbars used to form the network.
This scheme when applied to the S/E ICN is the S/E+ which
will be analyzed in this work.

Several researchers have proposed additional enhancements
to the basic network which are designed to improve
the fault tolerance of the networks. Adams[1] proposes the
use of bypass stages for the first and last stage of the
network. Tzeng[34] proposes the addition of intra-stage
links to reroute misdirected communication links.
Kumar[16] suggests minimizing the network by removing
inter-stage links not used for required permutations in an
SIMD machine.

Many of the above cited references contain performance
analyses. The most notable are Patel[23] and
Pease[24]. Other papers exist which focus on performance
analysis as opposed to network topology. Dias[8] provides
a performance analysis of a buffered, packet switched delta
network in a fault free state. Thanawastien[33] provides a
Markov chain traffic model of a fault free shuffle-exchange
network. Kruskal[15] studies the performance of multistage
networks in a packet switching mode. Cherhassky[7]
provides equations that can be used to calculate the
probability that a given pair of elements can communicate
using an ICN whose switches fail in a data or broken
connection mode. Shen[27-28] provides a method for
determining the sets of switch failures which are critical
in the sense that they disconnect the network into two or
more disjoint sets of processing elements.

Several experimental multiprocessor systems have been
built which utilize the above networks for system
interconnection. The Auburn Fault-Tolerance Distributed
Computing Laboratory currently includes a four processor,
four memory parallel system which utilizes a circuit
switched 4x4 shuffle exchange network as the system
ICN[20]. This system is designed for experiments in fault
tolerance and system software as related to multiprocessor
systems. The Texas Reconfigurable Array Processor (TRAC)
consists of 16 processors and 81 memories / 10 ports
connected by a 4 stage banyan network utilizing switches
with 2 inputs and 3 outputs each[14,25,26]. This ICN
operates in a mixed circuit switched, packet switched mode.
The TRAC system is designed for experiments in software and
hardware integration on complex, multiprocessor systems.
The NYU 'Ultracomputer' group has conducted extensive
studies on the architectural requirements for a 4096x4096
system utilizing an omega ICN in the packet switched
mode[9,11,12]. An experimental 8x8 system is currently
being implemented.

The properties of these networks have been investigated
by many researchers. Their efforts have concentrated,
however, on either analyzing the throughput and
permutation capabilities of such networks in the fully
functional state or on designing and analyzing the fault
tolerant capabilities of the network. Little if any work
has been done in the area of reliability measures for these
networks. Such measures are needed to evaluate the
effectiveness of fault tolerant system designs employing these
networks as ICNs. The purpose of this work is to develop a
model which predicts the bandwidth of the ICN as a function
of failure parameters for the switches which comprise the
ICN. Such a model can then be used to evaluate the effects
of proposed reliability enhancements on the system
bandwidth. Models which relate ICN performance measures to
failure characteristics of the ICN switches are critical
for the design of fault tolerant systems and for evaluating
the cost and performance effects of proposed reliability
measures.

Fault  Tolerance  and  Reliability 

Fault tolerance is defined as "the correct execution
of a specified algorithm in the presence of defects"[29].
The repeated and regular nature of large scale processing
arrays, coupled with the relatively high probability that
one or more of the system elements is defective at any
time, demands that such systems be designed to tolerate
faults. If the overall system is to tolerate faults and
continue to operate correctly, the ICN must be capable of
functioning in the presence of internal faults. Thus, in
terms of the ICN, fault tolerance is defined as the ability
to meet system communication demands in the presence of
failures internal to the ICN.

Reliability as a function of time is defined as "the
conditional probability that the system has survived the
interval [0,t], given that it was operational at time t=0"
[29]. In terms of the ICN, reliability can be defined as
the probability that the ICN is able to meet the
communication requirements of the system at time t, given that the
ICN was fault free at time 0. Such a definition requires
one to define 'communication requirements' for a system of
parallel processing elements and their associated memory
units. Two measures of communication capability, bandwidth
and connectivity, can be used to specify the requirements
of such a system. Cherhassky et al.[7] have developed
models which predict the probability that a communication
channel can be established between a randomly selected pair
of processing elements for a class of the above ICNs, in
the presence of data type faults in the underlying switches
that comprise the ICN. Several researchers[23,33] have
developed bandwidth models for the ICNs mentioned above.
These models are, however, only valid in the fault free
case. The purpose of this work is to develop and present a
model which can be used to predict the bandwidth of
shuffle-exchange (S/E) and augmented shuffle-exchange
(S/E+) interconnection networks, given a failure model for
the switches that comprise these networks.

S/E and S/E+ Networks

The binary crossbar S/E network is composed of log_2 N
identical stages, where N is the number of processors and
memories connected to the network. Each stage consists of
N/2 binary crossbar switches. The stages are interconnected
by a perfect shuffle wiring pattern. Such a network
admits a simple, distributed control algorithm. Each
switch within the network is set according to the
corresponding digit in the binary number of the memory desired.
Figure 3 shows an 8x8 S/E network. The bold lines indicate
the switch settings required for processor 5 to access
memory 3. At each stage within the network, the
corresponding switch is set so that, if the corresponding bit
of the desired memory address is 0, then the input is
connected to the upper output. If the corresponding bit is
1, the input is connected to the lower output. Figure 4
shows the four possible input output combinations that can
be requested. In this model the switch itself can only
support one of two possible configurations at any one time.
These are the X and T states, and are shown in figure 5.
In the more general case the switches can also support a
broadcast mode in which an input is connected to more than
one output. The analysis of faults in a system which
utilizes the broadcast mode is beyond the scope of this
work.

Figure 3 - 8x8 S/E

Figure 4 - Allowable Requests
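
The distributed control rule described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `shuffle` and `route` are hypothetical, and it assumes that a perfect shuffle precedes every stage and that the memory address bits are consumed most significant bit first, which may differ in detail from the wiring drawn in figure 3.

```python
def shuffle(line, k):
    """Perfect shuffle: rotate the k-bit line number left by one bit."""
    n = 1 << k
    return ((line << 1) & (n - 1)) | (line >> (k - 1))


def route(src, dst, k):
    """Trace a request from processor src to memory dst.

    Returns a list of (stage, switch, state) tuples and the final
    output line. Assumes a shuffle before each stage and MSB-first
    use of the destination address (assumptions of this sketch).
    """
    line = src
    path = []
    for stage in range(k):
        line = shuffle(line, k)
        # Address bit 0 selects the upper output, 1 the lower output.
        bit = (dst >> (k - 1 - stage)) & 1
        # T (through) if the request stays on its own side, X otherwise.
        state = "T" if (line & 1) == bit else "X"
        path.append((stage, line >> 1, state))
        line = (line & ~1) | bit
    return path, line


path, out = route(5, 3, 3)
print(path, out)
```

Under these assumptions the request from processor 5 to memory 3 needs no knowledge of the source: every switch looks only at its own address bit, which is why the network needs so little control logic.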
When two requests for different configurations are
present at the inputs, one and only one can be satisfied.
The other is blocked and cannot be completed until the
switch is released. In this analysis it is assumed that
the arbitration between conflicting inputs is random with
either of the switch inputs equally likely to have its
request satisfied in a given cycle.


Figure  5  -  Allowable  Switch  States 


Figure 6 - 8x8 S/E+ Processor 5 to Memory 3 Connection

Figure 6 shows an 8x8 S/E+ network. This network is
identical to the S/E network except that it has an extra
stage[21]. In this model the extra stage is used only for
fault tolerance. The base S/E network provides only a
single path from any input to any output, while the S/E+
network provides two paths from any input to any output.
These paths are disjoint with the exception of the first
and last stage switches. The bold paths show the two paths
that can be used to connect processor 5 to memory number 3.

A number of different control strategies can be
developed for the S/E+ network. These can be designed to
provide either performance improvement or increased fault
tolerance as compared to the base network. In this study,
it is assumed that the extra stage is used only for
reliability enhancement. The control strategy considered in
this study is as follows: Each first stage switch is
controlled with the first bit of the corresponding
processor ID until an error is detected. Once any error is
detected along the path to a memory, the first bit of the
address used to access that memory is inverted for all
subsequent access attempts for that processor-memory pair.
Thus all memory requests will initially attempt to set the
first column switches in the T, or 'through', configuration.
As failures which produce errors in accessing a memory
occur, the processors will change the requests going to
that memory so as to request an X configuration at the
first stage. The remaining columns in the network are
controlled in the same manner as the base S/E network.
Such a control strategy requires little or no intelligence
in the ICN and limits the need for substantial logic in
each switch.
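
A minimal sketch of the processor-side bookkeeping this strategy implies is given below. The class and method names are hypothetical and chosen only for illustration; the sketch assumes that error detection is reported back to the requesting processor, and it models nothing beyond the first stage request.

```python
class Processor:
    """Keeps one bit of history per memory: has an error been seen?"""

    def __init__(self, pid):
        self.pid = pid
        # Memories for which an error was detected on the primary path.
        self.error_seen = set()

    def first_stage_request(self, memory):
        """Setting requested of the first column switch for this memory."""
        return "X" if memory in self.error_seen else "T"

    def report_error(self, memory):
        """Record an error; all later accesses to this memory request X."""
        self.error_seen.add(memory)


p = Processor(5)
before = p.first_stage_request(3)
p.report_error(3)
after = p.first_stage_request(3)
print(before, after)
```

The switch to the alternate path is made per processor-memory pair, so an error on the path to one memory does not change the requests issued for any other memory.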

Importance  of  a  Bandwidth  Model 

Designers of large scale, fault tolerant, parallel
systems must make many design choices regarding the ICN to
be used. Currently available design tools are not
sufficient to assist and guide those choices. The equations
derived by Cherhassky[7] provide a guide to calculating
the probability that a random pair of elements can
successfully communicate over an ICN which may have
experienced data mode failures. The bandwidth model provided by
Patel[23] can only be used to estimate the ICN bandwidth
prior to the occurrence of faults within the ICN. The
model derived by Shen[27,28] can be used to identify the
faults which can disconnect the system but cannot determine
the effect of these or other faults on communications
bandwidth. With the exception of Cherhassky's equation,
none of these tools estimates the performance of the ICN in
the presence of expected failures.

In chapter 2 a bandwidth model for the S/E network is
presented. This model gives the expected bandwidth as a
function of failure parameters of the component switches of
the network. In chapter 3 this model is extended to cover
the S/E+ network. In chapter 4, in order to verify the
model, a comparison of the results of the model and a
simulation of an 8x8 S/E+ network is presented. In chapter 5
the model is used to investigate reliability enhancements
as they affect the ICN bandwidth. In chapter 6 a summary
is presented and some topics for further research in this
area are discussed.


II. FAILURE DEPENDENT BANDWIDTH IN SHUFFLE EXCHANGE NETWORKS

A model is now presented which can be used to predict the
failure dependent bandwidth of S/E networks composed of binary
crossbar switches, in the presence of either address or
data mode faults in these switches. For
the purposes of this paper, bandwidth is defined as the
average number of active connections which are
simultaneously supported by the network. The purpose of this model
is to provide reasonable estimates of the expected bandwidth
as a function of failure parameters associated with the
component switches of the ICN. The model can then be used
to estimate the cost effectiveness of reliability
enhancements as related to the ICN bandwidth.


Figure 7 - Stuck at Faults (panels: S at X, S at T)

Binary Crossbar Fault Model

A static fault model, which classifies all faults in
the ICN as either address mode or data mode faults in the
binary crossbar switches, will be used. The address mode
fault model has been used by other researchers[3,23] and
can model a large number, though certainly not all, of the
faults possible in an S/E network. The address mode
failure model states that the switch fails in one of two
possible configurations. These are shown in figure 7 and
represent a condition in which the switch is frozen in one
of its two possible configurations. As a result, the
switch no longer responds to input requests but rather
routes inputs to outputs in a fixed pattern. Given that an
address mode failure has occurred, it is assumed that
either failure configuration, S at T or S at X, is equally
likely. Data mode faults result in the switch being unable
to correctly transmit any information. It is certainly
possible for a switch to undergo a data mode failure
subsequent to an address mode failure. In this case it is
classified as a data mode failure. Thus, at all times, the
total probability of failure is equal to the probability of
an address mode failure plus the probability of a data mode
failure. Further, it is assumed that the failure
probabilities for the switches are known or can be
calculated.
Processor Input Assumptions

Analysis of these networks is based on the following
assumptions:

1) The system contains N = 2^k processors and N = 2^k
memory modules that are statistically identical in
each group.

2) The processors and memories are connected by either
a k column S/E network or a k+1 column S/E+ network
as described previously.

3) All two input, two output routing switches are
statistically identical. Only the internal path
configurations described in fig. 5 are allowed.

4) Circuit switching is used. A processor is held in
a wait state if a requested access cannot be
completed.

5) Conflict resolution at each routing switch is
unbiased. Given that two conflicting requests are
present at the inputs to a switch, each request has
probability 1/2 of being satisfied.

6) Memory requests issued by each processor are
independent and are uniformly distributed over the N
memory modules.

7) In the case of the S/E+ network, each processor
maintains a single bit history of the accesses made
to each memory module. This history specifies
whether an error has occurred during an attempted
access to that memory. If no error has occurred
then the processor requests a T setting of the
first column switch for all accesses to that
memory. If an error has occurred during an access
to that memory then the processor will request an X
setting in the first column switch for all
subsequent accesses to that memory.

8) During each cycle each processor will submit a
request with probability designated by m_0.

9)  Error  detection  is  perfect.  Only  those  memory 
accesses  that  are  correctly  routed  reach  the 
desired  memory  and  thus,  only  these  are  considered 
in  the  bandwidth  computation. 

10) Failures among the switches that comprise the
network occur at a very low frequency with respect to
processor memory requests.

These assumptions are necessary to make the model
tractable. The limitations they impose in relation to
actual systems should, however, be understood. Assumptions
6 and 8 imply that blocked requests are not resubmitted
during the next cycle but rather are ignored. Simulations
performed by others[22,23,37,38] indicate that, for fault
free systems, this assumption does not significantly alter
the results. These assumptions also imply that processors
continue to attempt to access memories that cannot be
reached by reason of multiple failures. It is reasonable
to assume that some system reconfiguration will occur after
a failure is detected and that this reconfiguration will
restrict the set of memories that may be accessed by a
processor to those that can be reached. The amount of
error induced by this assumption is not known but should be
small for times where the probability of failure is
reasonably low. For times where the probability of failure
for an individual switch is high, the bandwidth derived
here should be considered a lower bound.

Assumption 9 states that error detection is perfect.
It is not reasonable to expect perfect error detection in a
faulty network. The probability that an undetected error
occurs can be reduced by employing both hardware error
detection and periodic software testing. Undetected errors
will decrease the effective bandwidth. The model treats
undetected errors as successful accesses and thus slightly
overestimates the effective bandwidth.

Assumption 10 implies that, for the S/E+ network, the
processor can be assumed to know whether an error has
occurred on the primary path (T connection of first stage
switch) prior to making the request. This is equivalent to
ignoring the requests that actually discover the error in
the bandwidth calculations. If, on average, many accesses
occur between failures, this will have a negligible effect
on the bandwidth. This should be true for all practical
systems.


The basic S/E network has only a single possible path
for each processor memory pair. As a result of this, of the
independence assumption for the processor requests, and of the
statistical independence of the component switches, the
events that the two inputs to any given switch are active
are statistically independent. This follows since the
probability that an input is active is a function of the
set of processors that could have generated the input and
the set of switches that could have processed the input
prior to the switch. For the two inputs of any given
switch in the S/E network the sets of processors and
switches that can affect one input are disjoint from the
sets that can affect the other input. This is illustrated in
figure 8, which shows the sets of processors and switches
that can affect the inputs to the first switch in the third
column.
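
The disjointness claim can be checked mechanically for a small network. The sketch below enumerates which processors can reach each input of the first switch in the third column of an 8x8 network. It is an illustration only: the helper names are hypothetical, it assumes a perfect shuffle before every stage (which may differ in detail from the wiring of figure 8), and it tracks processors only, not the intermediate switches.

```python
def shuffle(line, k):
    """Perfect shuffle: rotate the k-bit line number left by one bit."""
    n = 1 << k
    return ((line << 1) & (n - 1)) | (line >> (k - 1))


def sources_reaching(k, stage, line):
    """Processors whose requests can arrive on `line` at the input of
    `stage` (0-indexed), assuming a shuffle before each stage."""
    sources = set()
    for p in range(1 << k):
        lines = {p}
        for s in range(stage + 1):
            lines = {shuffle(q, k) for q in lines}
            if s < stage:
                # A switch may forward a request to either of its outputs.
                lines = {(q & ~1) | b for q in lines for b in (0, 1)}
        if line in lines:
            sources.add(p)
    return sources


# Inputs of the first switch in the third column are lines 0 and 1.
upper = sources_reaching(3, 2, 0)
lower = sources_reaching(3, 2, 1)
print(sorted(upper), sorted(lower), upper.isdisjoint(lower))
```

Under these wiring assumptions the two source sets share no processor, which is the property the independence argument relies on.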
The bandwidth of the system can be determined by
calculating the probability that a single output line is
active. By symmetry this probability is the same for all
output lines. The expected number of active connections
then is the bandwidth and is equal to N times the
probability that a single line is active. The probability that
an output line is active can be determined by k repeated
evaluations of the probability that a stage output line is
active given the probability that its input lines are
active.


Figure 9 - Switch I/O notation (inputs A1 and A2, outputs B1 and B2)

Figure 9 illustrates the notation that will be used
in the development of these equations. The inputs to a
switch are labeled A and the outputs are labeled B. The
event that the upper input is active is denoted by A1 and
the event that the lower input is active by A2. The event
that the upper input is active and requests the upper
output is denoted by A11. The event that the upper output
is active is denoted by B1. Thus A22 is the event that the
lower input is active and is requesting the lower
output (address bit is a 1).

The  probability  that  an  output  line  from  a  given 
switch  is  active  can  be  expressed  as  follows: 

$$P(B_1) = P(B_1 \mid \text{no fail})\,P(\text{no fail}) + P(B_1 \mid \text{address fail})\,P(\text{address fail}) + P(B_1 \mid \text{data fail})\,P(\text{data fail})$$

where  the  failure  event  probabilities  are: 

$$P(\text{address fail}) = p_a, \qquad P(\text{data fail}) = p_d, \qquad P(\text{fail}) = p_a + p_d = p_f$$

Now  the  probability  that  an  output  is  active,  given  that 
the  switch  has  failed  in  address  mode,  can  be  expressed  as 

$$P(B_1 \mid \text{address fail}) = P(B_1 \mid S \text{ at } X)\,P(S \text{ at } X \mid \text{addr fail}) + P(B_1 \mid S \text{ at } T)\,P(S \text{ at } T \mid \text{addr fail})$$

and since, given a failure, either mode is equally likely,
then

$$P(S \text{ at } X \mid \text{address fail}) = P(S \text{ at } T \mid \text{address fail}) = 0.5$$

For the purposes of the model it is assumed that any
request that is misrouted as a result of a failure of a
switch is blocked in that switch. In reality this would
probably be detected in the next stage. Since perfect
error coverage has been assumed, the request will be
blocked in the network. In addition it will, with
probability one, be detected at the next unfailed stage to
which it is incorrectly routed and then discarded. As a
result it will not affect arbitration, and therefore
blocking, at an operational switch. Thus there is no loss
in generality by assuming that the request is blocked at
the stage in which the error occurs. Given this, then

$$P(B_1 \mid S \text{ at } T) = 0.5\,P(A_1)$$

$$P(B_1 \mid S \text{ at } X) = 0.5\,P(A_2)$$

$$P(B_1 \mid \text{data fail}) = 0$$

Let us represent the probability that an input is
active by m_in. By symmetry

$$P(A_1) = P(A_2) = m_{in}$$

Then the probability that the upper output is active can be
written as

$$P(B_1) = P(B_1 \mid \text{no fail})(1 - p_f) + 0.5\,m_{in}\,p_a$$

Now

$$P(B_1 \mid \text{no fail}) = P(A_{11} \cup A_{21})$$

That is, the output will be active if either input is
active and requests that output. Then

$$P(A_{11} \cup A_{21}) = P(A_{11}) + P(A_{21}) - P(A_{11} \cap A_{21})$$

Now, because the requested memories are independent and
uniformly distributed, so also are the control bits at each
switch and

$$P(A_{11}) = P(A_{21}) = 0.5\,m_{in}, \qquad P(A_{11} \cap A_{21}) = 0.25\,m_{in}^{2}$$

Combining

$$P(B_1) = (m_{in} - 0.25\,m_{in}^{2})(1 - p_f) + 0.5\,m_{in}\,p_a$$

This formula can be evaluated k times using m0 as m_in
in the first evaluation and P(B) for stage i as m_in for
stage i+1. This will give the probability that a last
stage output line is active. Notice that this formula
reduces to that derived by Patel[23] when pf is zero.
Figure 10 is a plot of the bandwidth, normalized by N, of
an S/E network as a function of the probability of address
mode failure for several values of the probability of data
mode failure.
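
The k-fold evaluation described above can be sketched in a few lines of Python. This is only an illustration of the recursion, not code from the original work; the function names are hypothetical, and N = 2**k is assumed for a k-stage network.

```python
def se_output_probability(m0, p_a, p_d, k):
    """Probability that a last-stage output line of a k-stage S/E network is active.

    m0  : probability that a processor submits a request in a cycle
    p_a : probability of an address mode failure of a switch
    p_d : probability of a data mode failure of a switch
    """
    p_f = p_a + p_d
    m = m0
    for _ in range(k):
        # P(B1) = (m_in - 0.25 m_in^2)(1 - p_f) + 0.5 m_in p_a
        m = (m - 0.25 * m * m) * (1.0 - p_f) + 0.5 * m * p_a
    return m


def se_bandwidth(m0, p_a, p_d, k):
    """Expected number of active output lines (assumes N = 2**k)."""
    n = 2 ** k
    return n * se_output_probability(m0, p_a, p_d, k)
```

Dividing `se_bandwidth` by N gives the normalized quantity of the kind plotted in Figure 10; with p_a = p_d = 0 the loop reduces to the failure-free recursion.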
Figure 10a - Normalized Bandwidth vs Pa for Pd=0

Figure 10b - Normalized Bandwidth vs Pa for Pd=

Figure 10c - Normalized Bandwidth vs Pa for Pd=0.2

ANALYSIS OF THE S/E+ NETWORK


The analysis of the S/E+ network provides a great
deal more challenge. The situation is complicated by two
facts. First, the extra stage, which is referred to as stage
0, behaves differently than the stages of the S/E network.
Its control bits are not independent and uniformly
distributed, as are the control bits in the S/E network,
but rather are dependent on the state of the network.
Next, since the setting of the first stage switches depends
on the failure state of the remainder of the network, it
cannot be assumed that the events that the two input lines
to a switch in columns 1 to k are active are independent.

The approach that will be used is to first calculate the
probability that the output lines from stage zero are
active and then show that the inputs to any first stage
switch (the stage following the added stage) are
essentially independent. Once this has been established,
much of the above analysis can be used on the last k stages
of the S/E+ network.

Probability of Active Stage 0 Outputs

Let B10 represent the event that the upper output of
a column 0 switch (the added column) is active. The
probability that this line is active will now be
calculated. As in the case of the S/E network, first
condition on whether the column 0 switch has failed or is
operational. Thus

$$P(B_{10}) = P(B_{10} \mid \text{no fail})\,P(\text{no fail}) + P(B_{10} \mid \text{address fail})\,P(\text{address fail}) + P(B_{10} \mid \text{data fail})\,P(\text{data fail})$$

As before

$$P(B_{10} \mid \text{data fail}) = 0$$

Now, given that the column 0 switch has failed in the
address mode, again condition on the configuration of the
failure. This time, however, there is no basis on which to
delete any requests which pass through a column 0 switch.
This follows since, if there are no other failures along
the path from that column 0 output to the requested memory,
the memory can be accessed from either output of the column
0 switch. Thus, address mode failures in column 0 do not
affect addressability unless they are coupled with other
failures. Requests that are blocked due to other
failures will be accounted for when those failures are
treated. Thus

$$P(B_{10} \mid \text{address fail}) = P(B_{10} \mid S \text{ at } X)\,P(S \text{ at } X \mid \text{address fail}) + P(B_{10} \mid S \text{ at } T)\,P(S \text{ at } T \mid \text{address fail})$$

$$= 0.5\,P(A_2) + 0.5\,P(A_1) = m_0$$

where m0 is the probability that a processor submits a
request in a given cycle. Thus

$$P(B_{10}) = P(B_{10} \mid \text{no fail column 0})(1 - p_f) + m_0\,p_a$$

Now let

$$P'(\text{event}) = P(\text{event} \mid \text{no fail column 0})$$

Then

$$P(B_{10} \mid \text{no fail column 0}) = P'(B_{10}) = P'(A_{11} \cup A_{21})$$

and

$$P'(A_{11} \cup A_{21}) = P'(A_{11}) + P'(A_{21}) - P'(A_{11} \cap A_{21})$$

$$P'(A_{11}) = P'(A_{11} \mid A_1)\,P(A_1) = P'(A_{11} \mid A_1)\,m_0$$

$$P'(A_{21}) = P'(A_{21} \mid A_2)\,P(A_2) = P'(A_{21} \mid A_2)\,m_0$$

$$P'(A_{11} \cap A_{21}) = P'(A_{11} \cap A_{21} \mid A_1 \cap A_2)\,m_0^{2}$$

as before. Now let

$$P''(\text{event}) = P'(\text{event} \mid A_1 \cap A_2)$$

That is, P'' of an event is the probability of that event
given that the two inputs of interest are active and that
the column 0 switch has not failed.


Now given that A1 is active, the event A11 will occur
only if there is no error along the primary path (the one
selected by the T setting of the column 0 switch) for the
A1 input and

$$P'(A_{11} \mid A_1) = P''(A_{11}) = P(\text{no error exists in the selected primary path, columns 1 to } k) = (1 - 0.5p_a - p_d)^{k}$$

Given that A2 is active, the event A21 will occur only if
there is at least one error on the primary path for the A2
input. Thus

$$P'(A_{21} \mid A_2) = P''(A_{21}) = P(\text{at least one error exists in the selected primary path, columns 1 to } k) = 1 - (1 - 0.5p_a - p_d)^{k}$$

To calculate the probability of the intersection of
events A11 and A21, condition on whether the A1 and the A2
primary paths share a last column switch. Note that they
cannot share any switches in columns 1 to k-1 or there
would be more than two paths from any input to output of
the S/E+ network. This fact is easily proved. Thus

$$P''(A_{11} \cap A_{21}) = P''(A_{11} \cap A_{21} \mid \text{share last column switch})\,(2/N) + P''(A_{11} \cap A_{21} \mid \text{last column switch not shared})\,(1 - 2/N)$$

where 2/N is the probability that the two inputs request a
pair of memories that must be accessed through the same
last column switch.

Now if the two input requests are not for memories
which require the sharing of a last column switch, the two
paths are independent and

$$P''(A_{11} \cap A_{21} \mid \text{last col switch not shared}) = P''(A_{11})\,P''(A_{21}) = (1 - 0.5p_a - p_d)^{k}\,[1 - (1 - 0.5p_a - p_d)^{k}]$$

In order to calculate the required probabilities
given that a last column switch is shared, condition on
whether the shared switch is failed. Thus

$$P''(A_{11} \cap A_{21} \mid \text{share last col switch}) = P''(A_{11} \cap A_{21} \mid \text{share, shared switch unfailed})(1 - p_f) + P''(A_{11} \cap A_{21} \mid \text{share, shared switch addr fail})\,p_a + P''(A_{11} \cap A_{21} \mid \text{share, shared switch data fail})\,p_d$$

and, if the switch has not failed, it cannot cause an error;
therefore

$$P''(A_{11} \cap A_{21} \mid \text{share, unfailed}) = (1 - 0.5p_a - p_d)^{k-1}\,[1 - (1 - 0.5p_a - p_d)^{k-1}]$$
Given that the shared switch has failed in the
address mode, again condition on whether the failure causes
an error for the A2 primary path. Note again that it must
not cause an error for the A1 path as the probability of
the event is 0 if there is an error on that primary path.
Thus

$$P''(A_{11} \cap A_{21} \mid \text{share, address fail}) = P''(A_{11} \cap A_{21} \mid \text{share, addr fail, error } A_2)\,P''(\text{error} \mid \text{share, addr fail}) + P''(A_{11} \cap A_{21} \mid \text{share, addr fail, no error})\,P''(\text{no error} \mid \text{share, addr fail})$$

Now given that a switch has failed in the address
mode and has two active inputs, one of four equally likely
conditions holds with respect to errors for those inputs.
Of interest are the two of these that do not produce an
error for the A1 input. Thus

$$P''(\text{error } A_2 \cap \text{no error } A_1 \mid \text{addr fail}) = 0.25$$

$$P''(\text{no error } A_2 \cap \text{no error } A_1 \mid \text{addr fail}) = 0.25$$

Now, if a switch is failed in the address mode and is
shared in the last column, but produces no error, then it
is the same as if it were functioning and

$$P''(A_{11} \cap A_{21} \mid \text{share, addr fail, no error}) = (1 - 0.5p_a - p_d)^{k-1}\,[1 - (1 - 0.5p_a - p_d)^{k-1}]$$

If it produces an error for the A2 path but not for the A1
path then the probability that at least one error occurs in
the A2 path is one and

$$P''(A_{11} \cap A_{21} \mid \text{shared, failed, error}) = (1 - 0.5p_a - p_d)^{k-1}$$

Combining  these  equations  will  allow  the  calculation  of  the 
probability  that  one  of  the  output  lines  from  a  column  0 
switch  is  active.  By  symmetry  this  value  is  the  same  for 
all  lines. 
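
As a rough illustration of how the preceding equations combine, the following Python sketch evaluates the probability that a column 0 output line is active. The function name is hypothetical, N = 2**k is assumed, and the conditional probability given a data mode failure of the shared last column switch is taken to be zero; the text does not state that value explicitly, so it is an assumption made here because such a failure puts an error on the A1 primary path.

```python
def stage0_output_probability(m0, p_a, p_d, k):
    """Sketch of P(B10) for the S/E+ network under the assumptions stated above."""
    p_f = p_a + p_d
    n = 2 ** k                          # assumption: N = 2**k memories
    q = 1.0 - 0.5 * p_a - p_d           # one switch causes no error on a path
    ok_k = q ** k                       # P''(A11): no error in columns 1..k
    ok_k1 = q ** (k - 1)                # no error in the k-1 unshared switches

    # Last column switch not shared: the two primary paths are independent.
    not_shared = ok_k * (1.0 - ok_k)

    # Last column switch shared, conditioned on its failure state.
    shared_unfailed = ok_k1 * (1.0 - ok_k1)
    shared_addr = 0.25 * ok_k1 + 0.25 * shared_unfailed
    shared_data = 0.0                   # assumption, see lead-in
    shared = (shared_unfailed * (1.0 - p_f)
              + shared_addr * p_a
              + shared_data * p_d)

    both = shared * (2.0 / n) + not_shared * (1.0 - 2.0 / n)  # P''(A11 n A21)

    # P'(B10) = P'(A11) + P'(A21) - P'(A11 n A21)
    p_prime = ok_k * m0 + (1.0 - ok_k) * m0 - both * m0 * m0
    return p_prime * (1.0 - p_f) + m0 * p_a
```

With no failures every request follows its primary path, so the sketch returns m0 unchanged.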

Joint  Probability  of  Active  Stage  0  Outputs 

The  above  derived  probability  is  not  sufficient.  It 
would  be  convenient  to  apply  the  results  of  the 
calculations  done  for  the  S/E  network  by  using  the 
probability  that  a  column  0  output  line  is  active  as  the  m0 
in  the  equations  derived  for  the  S/E  network,  using  that  to 
represent the remainder of the S/E+ network. Recall,
however,  that  these  equations  were  derived  under  the 
condition  that  requests  arriving  at  the  two  input  lines  are 
independent.  This  is  not  the  case  for  the  switches  in 
column  1  of  the  S/E+  network.  It  is  the  contention  of  this 
work,  however,  that  they  are  approximately  independent. 
Given  this,  the  assumption  that  they  are  independent  will 
induce  only  small  errors  in  the  model.  To  prove  that  the 
dependence  is  indeed  small,  first  calculate  the  joint 
probability  that  the  two  inputs  to  any  column  1  switch  are 
active  and  then  calculate  the  covariance  of  indicator 
random  variables  for  each  of  these  lines.  The  covariance 
is  given  by 

$$\text{covar} = P(B_{10}^{U} \cap B_{10}^{L}) - P^{2}(B_{10})$$

where $B_{10}^{U}$ and $B_{10}^{L}$ represent the events that the two outputs
from stage 0 switches, which are the inputs to a particular
column 1 switch, are active. Now $P(B_{10}^{U} \cap B_{10}^{L})$ is calculated.
First  examine  figure  11.  The  highlighted  switches 
illustrate  an  important  relationship.  Notice  that  if  any 
column  1  switch  is  picked  and  its  input  lines  traced  back 
to  their  respective  column  0  switches,  the  other  output 
lines  from  the  column  0  switches  both  terminate  at  the  same 
column  1  switch.  Thus,  to  calculate  the  probabilities  that 
two  inputs  to  the  same  column  1  switch  are  jointly  active, 
two sets of paths must be considered. These sets of
paths enter column 1 through the two highlighted switches.
In one set the probability that no errors exist must be
determined. In the other set, the probability that at
least one error exists in each of the paths in the set must
be calculated. As shown in figure 11 the upper column 0
switch is marked with a U. All quantities relating to that
switch will be superscripted with a U, while all quantities
relating to the lower switch will be superscripted with an
L. Thus $P(B_{10}^{U} \cap B_{10}^{L})$ must be calculated. Now

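$$\begin{aligned}
P(B_{10}^{U} \cap B_{10}^{L}) ={}& P(B_{10}^{U} \cap B_{10}^{L} \mid \text{upper, lower col 0 not failed})\,(1 - p_f)^{2} \\
&+ P(B_{10}^{U} \cap B_{10}^{L} \mid \text{upper addr fail, lower not failed})\,(1 - p_f)\,p_a \\
&+ P(B_{10}^{U} \cap B_{10}^{L} \mid \text{upper not failed, lower addr fail})\,(1 - p_f)\,p_a \\
&+ P(B_{10}^{U} \cap B_{10}^{L} \mid \text{upper and lower addr fail})\,p_a^{2} \\
&+ P(B_{10}^{U} \cap B_{10}^{L} \mid \text{upper data fail, lower not failed})\,(1 - p_f)\,p_d \\
&+ P(B_{10}^{U} \cap B_{10}^{L} \mid \text{upper not failed, lower data fail})\,(1 - p_f)\,p_d \\
&+ P(B_{10}^{U} \cap B_{10}^{L} \mid \text{upper and lower data fail})\,p_d^{2} \\
&+ P(B_{10}^{U} \cap B_{10}^{L} \mid \text{upper data fail, lower addr failed})\,p_a p_d \\
&+ P(B_{10}^{U} \cap B_{10}^{L} \mid \text{upper addr failed, lower data fail})\,p_a p_d
\end{aligned}$$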

The last five terms in the above equation represent
one or more data mode failures in the column 0 switches
which form the inputs to the selected column 1 switch. Given
that a data mode failure has occurred in one of these
switches the probability of $B_{10}^{U} \cap B_{10}^{L}$ is 0. Therefore these
terms will be disregarded in the following discussion.

Now  redefine  P'  of  an  event  as  the  probability 
of  the  event  given  that  the  stage  0  switches  that  affect 
the  event  are  functioning.  This  retains  the  previous 
definition  and  includes,  for  events  that  depend  on  two 
column  0  switches,  the  following 

$$P'(\text{event}) = P(\text{event} \mid \text{upper, lower col 0 not failed})$$

Then

$$\begin{aligned}
P(B_{10}^{U} \cap B_{10}^{L}) ={}& P'(B_{10}^{U} \cap B_{10}^{L})\,(1 - p_f)^{2} \\
&+ P'(B_{10}^{L})\,m_0\,p_a\,(1 - p_f) \\
&+ P'(B_{10}^{U})\,m_0\,p_a\,(1 - p_f) \\
&+ m_0^{2}\,p_a^{2}
\end{aligned}$$

Now by symmetry

$$P'(B_{10}^{U}) = P'(B_{10}^{L}) = P'(B_{10})$$

which was derived previously. Thus the first term of the
above equation is the only new value which must be
evaluated. It can be expressed as follows:

$$P'(B_{10}^{U} \cap B_{10}^{L}) = P'(B_{10}^{U}) + P'(B_{10}^{L}) - P'(B_{10}^{U} \cup B_{10}^{L})$$
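
The relationship between the joint probability, the marginal probability and the covariance can be sketched as follows. This is an illustrative helper only (the function name and argument names are hypothetical); it takes the two conditional quantities P'(B10) and P' of the union as inputs, uses the inclusion-exclusion step for the intersection, drops the data mode failure terms because they contribute zero, and returns the covariance of the two indicator variables.

```python
def stage0_covariance(p_b_prime, p_union_prime, m0, p_a, p_d):
    """Covariance of the indicators of the two inputs to a column 1 switch.

    p_b_prime     : P'(B10), given the column 0 switch has not failed
    p_union_prime : P' of the union of the two stage 0 output events,
                    given both column 0 switches have not failed
    """
    p_f = p_a + p_d
    # Inclusion-exclusion for the intersection, given no column 0 failures.
    p_joint_prime = 2.0 * p_b_prime - p_union_prime
    # Unconditional joint probability (data mode failure terms are zero).
    joint = (p_joint_prime * (1.0 - p_f) ** 2
             + 2.0 * p_b_prime * m0 * p_a * (1.0 - p_f)
             + (m0 * p_a) ** 2)
    # Marginal: P(B10) = P'(B10)(1 - p_f) + m0 p_a
    p_b = p_b_prime * (1.0 - p_f) + m0 * p_a
    return joint - p_b ** 2
```

If the two outputs were exactly independent given no column 0 failures, the union would equal 2P' - P'^2 and the returned covariance would be zero, which gives a simple sanity check on the bookkeeping.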

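Again the first two terms of this equation have been
calculated. The last can be expressed as

$$\begin{aligned}
P'(B_{10}^{U} \cup B_{10}^{L}) &= P'([A_{11}^{U} \cup A_{21}^{U}] \cup [A_{11}^{L} \cup A_{21}^{L}]) \\
&= 1 - P'(\overline{A_{11}^{U} \cup A_{21}^{U} \cup A_{11}^{L} \cup A_{21}^{L}}) \\
&= 1 - P'([A_{12}^{U} \cup \bar{A}_{1}^{U}] \cap [A_{22}^{U} \cup \bar{A}_{2}^{U}] \cap [A_{12}^{L} \cup \bar{A}_{1}^{L}] \cap [A_{22}^{L} \cup \bar{A}_{2}^{L}])
\end{aligned}$$

where the final form follows from the repeated application
of De Morgan's law and the fact that the complement of the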
event  A22  is  the  event  that  the  request  is  for  the  other 
output  or  that  there  is  no  request  at  all.  Expansion  of 
this  term  will  result  in  the  union  of  16  mutually  exclusive 
events.  The  probability  is  then  just  the  sum  of  the 
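probabilities of each of the events. Now the probability
of each of these 16 events must be evaluated.

The simplest is the case where no inputs are active.
This term is given by

$$P'(\bar{A}_{1}^{U} \cap \bar{A}_{2}^{U} \cap \bar{A}_{1}^{L} \cap \bar{A}_{2}^{L}) = (1 - m_0)^{4}$$

Next the four terms in which only one input is active
must be considered. These are easily evaluated in terms of
previously calculated quantities. They are

$$P'(\bar{A}_{1}^{U} \cap \bar{A}_{2}^{U} \cap \bar{A}_{1}^{L} \cap A_{22}^{L}) = P'(A_{22} \mid A_2)\,m_0(1 - m_0)^{3} = (1 - 0.5p_a - p_d)^{k}\,m_0(1 - m_0)^{3}$$

$$P'(\bar{A}_{1}^{U} \cap \bar{A}_{2}^{U} \cap A_{12}^{L} \cap \bar{A}_{2}^{L}) = P'(A_{12} \mid A_1)\,m_0(1 - m_0)^{3} = [1 - (1 - 0.5p_a - p_d)^{k}]\,m_0(1 - m_0)^{3}$$

$$P'(\bar{A}_{1}^{U} \cap A_{22}^{U} \cap \bar{A}_{1}^{L} \cap \bar{A}_{2}^{L}) = (1 - 0.5p_a - p_d)^{k}\,m_0(1 - m_0)^{3}$$

$$P'(A_{12}^{U} \cap \bar{A}_{2}^{U} \cap \bar{A}_{1}^{L} \cap \bar{A}_{2}^{L}) = [1 - (1 - 0.5p_a - p_d)^{k}]\,m_0(1 - m_0)^{3}$$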
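
The grouping of the 16 terms by the number of active inputs (one term with none, four with one, six with two, four with three, one with four) can be checked with a short enumeration. This is a bookkeeping sketch only; the function name is hypothetical, and it computes just the input-activity factor of each term, not the path-error probabilities that multiply it.

```python
from itertools import product


def activity_weights(m0):
    """Map number of active inputs -> (number of terms, activity factor per term).

    Each of the four inputs is independently active with probability m0, so a
    term with a active inputs carries the factor m0**a * (1 - m0)**(4 - a).
    """
    counts = {}
    for pattern in product((False, True), repeat=4):
        active = sum(pattern)
        counts[active] = counts.get(active, 0) + 1
    return {a: (c, m0 ** a * (1.0 - m0) ** (4 - a)) for a, c in counts.items()}
```

Summing count times factor over all groups gives one, as expected for a partition of the input-activity patterns.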

Next the six terms that involve two active inputs
must be evaluated. The first of these is

$$P'(\bar{A}_{1}^{U} \cap \bar{A}_{2}^{U} \cap A_{12}^{L} \cap A_{22}^{L}) = P'(A_{12}^{L} \cap A_{22}^{L} \mid A_{1}^{L} \cap A_{2}^{L})\,m_0^{2}(1 - m_0)^{2} = P''(A_{12}^{L} \cap A_{22}^{L})\,m_0^{2}(1 - m_0)^{2}$$

where P'' of an event has been redefined as the probability
of that event given that all switches in column 0 which
affect the event are functional and that all inputs that
are necessary for the event are active. Thus

$$P''(\text{event}) = P'(\text{event} \mid \text{all inputs required for the event are active})$$

Then by symmetry

$$P''(A_{12}^{L} \cap A_{22}^{L}) = P''(A_{11} \cap A_{21})$$

which has been evaluated.

The next term to be evaluated is

$$P'(\bar{A}_{1}^{U} \cap A_{22}^{U} \cap \bar{A}_{1}^{L} \cap A_{22}^{L}) = P''(A_{22}^{U} \cap A_{22}^{L})\,m_0^{2}(1 - m_0)^{2}$$

This  term  has  not  been  seen  before.  Here  the  two  primary 
paths  both  of  which  use  the  same  column  1  switch  must  be 
considered.  Since  they  both  pass  through  the  same  column  1 
switch  they  can  share  any  number  of  switches  from  1  to  k. 
Here,  as  before,  though  not  explicitly  stated  previously, 
share  is  in  the  sense  that  both  attempt  to  use  the  same 
switches. Obviously they may not be able to do this
simultaneously. Only the probability that errors do not occur
on either of these paths is of interest. In other words,
given no conflicts, that both accesses can be made. To
evaluate this term first condition on the number of
switches shared between the two paths. Thus

$$P''(A_{22}^{U} \cap A_{22}^{L}) = \sum_{j=1}^{k} P''(A_{22}^{U} \cap A_{22}^{L} \mid \text{paths share } j \text{ switches})\,P(\text{share } j \text{ switches})$$

The probability that j switches are shared is

$$P(\text{share } j \text{ switches}) = \begin{cases} 1/2^{j}, & j < k \\ 2/N, & j = k \end{cases}$$

This  can  easily  be  seen  by  realizing  that  the  ID  of  the 
memory  requested  determines  the  path  and  therefore  the 
number  of  shared  switches.  The  memory  ID  is  a  k  bit  binary 
number. The two paths will share exactly j switches if and
only if these two numbers match in exactly j-1 places.

Since  the  individual  binary  digits  in  each  number  are 
independent  and  uniformly  distributed  the  probability  is  as 
given  above. 
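
The distribution of the number of shared switches can be checked by brute-force enumeration over all pairs of k-bit memory IDs. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the digits are examined most significant bit first, that "match in exactly j-1 places" means the first j-1 digits agree and digit j differs, and that two paths share all k switches when the first k-1 digits agree.

```python
from fractions import Fraction


def shared_switch_count(a, b, k):
    """Number of switches shared by the paths to memory IDs a and b."""
    for j in range(1, k):                      # examine digits 1 .. k-1
        shift = k - j                          # bit position of digit j (MSB first)
        if (a >> shift) & 1 != (b >> shift) & 1:
            return j                           # first j-1 digits match, digit j differs
    return k                                   # first k-1 digits match


def share_distribution(k):
    """Exact distribution of the shared-switch count for uniform, independent IDs."""
    n = 2 ** k
    dist = {j: Fraction(0) for j in range(1, k + 1)}
    for a in range(n):
        for b in range(n):
            dist[shared_switch_count(a, b, k)] += Fraction(1, n * n)
    return dist
```

For k = 4 this gives 1/2, 1/4 and 1/8 for j = 1, 2, 3 and 2/N = 1/8 for j = k, in agreement with the expression above.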


Figure 12 - Exactly One Switch Shared

Figure 12 shows an example of the case where exactly
one switch is shared by the two paths. Notice that if this
occurs, the event can occur only if that shared switch has
not failed or, if it has failed in an address mode, the
failure matches both requests, given that the requests are
known to be different; and then only if no other errors are
caused by the remaining k-1 switches on each of the two
paths. Thus

$$P''(A_{22}^{U} \cap A_{22}^{L} \mid j = 1) = (1 - 0.5p_a - p_d)(1 - 0.5p_a - p_d)^{k-1}(1 - 0.5p_a - p_d)^{k-1} = (1 - 0.5p_a - p_d)^{2k-1}$$

Figure 13 shows an example of the case where there
are 3 shared switches and k>3. Here, for the event to
occur, both the first and last shared switch must be
functional. The control digits for the switches between
the first and last must be the same and therefore these
switches will not produce an error for either path if they
do not produce an error for one of the paths. Then the
remaining switches along both paths must not produce any
errors. Thus

$$P''(A_{22}^{U} \cap A_{22}^{L} \mid 1 < j < k) = (1 - p_f)(1 - 0.5p_a - p_d)^{j-2}(1 - p_f)(1 - 0.5p_a - p_d)^{k-j}(1 - 0.5p_a - p_d)^{k-j} = (1 - p_f)^{2}(1 - 0.5p_a - p_d)^{2k-j-2}$$

Figure 13 - Exactly 3 Shared Switches

Finally,  the  case  where  k  switches  are  shared  must  be 
evaluated.  Here  again  the  first  shared  switch  must  be 
functional.  Then  the  next  k-2  switches  must  not  produce  an 
error  for  the  paths,  given  that  the  control  digits  are  the 
same  for  both  paths.  Finally  the  last  switch  must  not 
produce an error for either path. If the switch has failed
in the address mode this can only occur if both paths
request the same memory and the switch has failed so that
the connection is possible. Thus

$$P''(A_{22}^{U} \cap A_{22}^{L} \mid j = k) = (1 - p_f)(1 - 0.5p_a - p_d)^{k-2}(1 - 0.75p_a - p_d)$$
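
The three cases (j = 1, 1 < j < k, and j = k) can be combined with the sharing probabilities to give the probability that neither primary path has an error. The following Python sketch shows one way to do that sum; the function name is hypothetical, N = 2**k is assumed, and k >= 2 is required so that the first and last cases are distinct.

```python
def both_primary_paths_clear(p_a, p_d, k):
    """Sketch of the probability that two primary paths entering the same
    column 1 switch are both error free, summed over the number of shared
    switches j."""
    if k < 2:
        raise ValueError("this sketch assumes k >= 2")
    p_f = p_a + p_d
    n = 2 ** k                               # assumption: N = 2**k
    q = 1.0 - 0.5 * p_a - p_d                # one switch causes no error on a path
    total = 0.0
    for j in range(1, k + 1):
        if j == 1:
            cond = q ** (2 * k - 1)
        elif j < k:
            cond = (1.0 - p_f) ** 2 * q ** (2 * k - j - 2)
        else:
            cond = (1.0 - p_f) * q ** (k - 2) * (1.0 - 0.75 * p_a - p_d)
        p_share = 0.5 ** j if j < k else 2.0 / n
        total += cond * p_share
    return total
```

With p_a = p_d = 0 every conditional term is one and the sharing probabilities sum to one, so the result is one.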

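The next term to be evaluated contains two secondary
path requests. This implies that at least one error exists
in each of the primary paths. The desired term is

$$P'(A_{12}^{U} \cap \bar{A}_{2}^{U} \cap A_{12}^{L} \cap \bar{A}_{2}^{L}) = P''(A_{12}^{U} \cap A_{12}^{L})\,m_0^{2}(1 - m_0)^{2}$$

The primary paths associated with these requests enter the
same column 1 switch. The desired probability then can be
derived from the last calculated term as follows:

$$\begin{aligned}
P''(A_{12}^{U} \cap A_{12}^{L}) &= 1 - P''(A_{11}^{U} \cup A_{11}^{L}) \\
&= 1 - P''(A_{11}^{U}) - P''(A_{11}^{L}) + P''(A_{11}^{U} \cap A_{11}^{L}) \\
&= 1 - 2(1 - 0.5p_a - p_d)^{k} + P''(A_{22}^{U} \cap A_{22}^{L})
\end{aligned}$$

where the last step follows from the symmetry involved.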

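The final three terms with two active inputs can be
shown by symmetry to be equal to terms previously derived.
Thus

$$\begin{aligned}
P'(\bar{A}_{1}^{U} \cap \bar{A}_{2}^{U} \cap A_{12}^{L} \cap A_{22}^{L}) &= P'(\bar{A}_{1}^{U} \cap A_{22}^{U} \cap A_{12}^{L} \cap \bar{A}_{2}^{L}) \\
&= P'(A_{12}^{U} \cap \bar{A}_{2}^{U} \cap \bar{A}_{1}^{L} \cap A_{22}^{L}) \\
&= P'(A_{12}^{U} \cap A_{22}^{U} \cap \bar{A}_{1}^{L} \cap \bar{A}_{2}^{L})
\end{aligned}$$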

Next, the equations for the four terms with three
active inputs must be derived. The first of these is

$$P'(\bar{A}_{1}^{U} \cap A_{22}^{U} \cap A_{12}^{L} \cap A_{22}^{L}) = P''(A_{22}^{U} \cap A_{12}^{L} \cap A_{22}^{L})\,m_0^{3}(1 - m_0)$$

Figure 14 - Example of a Three Input Term

Figure 14 shows a typical example of one of the sets
of paths which are the events in this term. Notice that
there are two primary paths which come from the A2 inputs
in which there can be no errors, and one primary path which
comes from the A1 input, in which at least one error must
occur. The two paths from the A2 inputs may share 1 to k
switches with each other and finally may share a column k
switch with the path from the A1 input. The probability
of this event can be calculated by conditioning on the
number of shared switches in the A2 paths and then on
whether a column k switch is shared with the A1 path.

First define the following:

$$E = [A_{22}^{U} \cap A_{12}^{L} \cap A_{22}^{L}]$$

Then, conditioning on the number of shared switches in the
A2 paths,

$$P''(A_{22}^{U} \cap A_{12}^{L} \cap A_{22}^{L}) = \sum_{j=1}^{k} P''(E \mid A_2 \text{ paths share } j \text{ switches})\,P(\text{share } j)$$

where the probability that the A2 paths share j switches
was previously derived.

Now define

$$P''_{j}(E) = P''(E \mid A_2 \text{ paths share } j \text{ switches})$$

Finally, condition on the sharing of a column k
switch.

$$P''_{j}(E) = P''_{j}(E \mid A_2 \text{ paths disjoint from } A_1 \text{ primary path})\,P(\text{disjoint}) + P''_{j}(E \mid A_2 \text{ paths share switch with } A_1 \text{ primary path})\,P(\text{share})$$

If the A2 paths share fewer than k switches they
access two of four possible memories. If the A1 path
accesses one of those four memories, it will share a last
column switch with the A2 paths. Thus

$$P(\text{share}) = \begin{cases} 4/N, & j < k \\ 2/N, & j = k \end{cases}
\qquad
P(\text{disjoint}) = \begin{cases} 1 - 4/N, & j < k \\ 1 - 2/N, & j = k \end{cases}$$

and, if a last column switch is shared by the A1 and A2
paths, condition on whether that switch has failed.

$$P''_{j}(E \mid \text{share}) = P''_{j}(E \mid \text{share, shared not failed})(1 - p_f) + P''_{j}(E \mid \text{share, shared addr fail})\,p_a + P''_{j}(E \mid \text{share, shared data fail})\,p_d$$

If it has failed in the address mode, condition on whether
it produces an error for the A1 primary path. Thus

$$P''_{j}(E \mid \text{share, addr fail}) = P''_{j}(E \mid \text{shr, addr failed, no err } A_1 \text{ primary path})\,P(\text{no err} \mid \text{addr fail, shr}) + P''_{j}(E \mid \text{shr, addr failed, err } A_1 \text{ primary path})\,P(\text{err} \mid \text{addr fail, shr})$$

Now given that the A2 paths and the A1 path do not
share a final column switch, they are independent and, for
the case where the A2 paths share 1 switch,

$$P''_{1}(E \mid \text{no shr}) = (1 - 0.5p_a - p_d)(1 - 0.5p_a - p_d)^{k-1}(1 - 0.5p_a - p_d)^{k-1}\,[1 - (1 - 0.5p_a - p_d)^{k}] = (1 - 0.5p_a - p_d)^{2k-1}\,[1 - (1 - 0.5p_a - p_d)^{k}]$$

For 1<j<k

$$P''_{j}(E \mid \text{no shr}, 1 < j < k) = (1 - p_f)^{2}(1 - 0.5p_a - p_d)^{2k-j-2}\,[1 - (1 - 0.5p_a - p_d)^{k}]$$

and for j=k

$$P''_{k}(E \mid \text{no shr}) = (1 - p_f)(1 - 0.5p_a - p_d)^{k-2}(1 - 0.75p_a - p_d)\,[1 - (1 - 0.5p_a - p_d)^{k}]$$
Now if they share a switch but that switch is
functional, it cannot produce an error and thus, for the case
where there is one shared switch in the A2 paths,

$$P''_{1}(E \mid \text{shr, no fail}) = (1 - 0.5p_a - p_d)(1 - 0.5p_a - p_d)^{k-2}(1 - 0.5p_a - p_d)^{k-1}\,[1 - (1 - 0.5p_a - p_d)^{k-1}] = (1 - 0.5p_a - p_d)^{2k-2}\,[1 - (1 - 0.5p_a - p_d)^{k-1}]$$

$$P''_{j}(E \mid \text{shr, no fail}, 1 < j < k) = (1 - p_f)^{2}(1 - 0.5p_a - p_d)^{2k-j-3}\,[1 - (1 - 0.5p_a - p_d)^{k-1}]$$

$$P''_{k}(E \mid \text{shr, no fail}) = (1 - p_f)(1 - 0.5p_a - p_d)^{k-2}\,[1 - (1 - 0.5p_a - p_d)^{k-1}]$$

If a column k switch is shared and has failed but
produces no errors then it is as if the switch had not
failed and

$$P''_{j}(E \mid \text{share, addr fail, no error}) = P''_{j}(E \mid \text{share, not failed})$$

Now if the shared switch has failed in the address mode and
produces an error for the A1 path then the probability of
at least one error on that path is one and

$$P''_{1}(E \mid \text{shr, addr fail, err } A_1) = (1 - 0.5p_a - p_d)(1 - 0.5p_a - p_d)^{k-1}(1 - 0.5p_a - p_d)^{k-2} = (1 - 0.5p_a - p_d)^{2k-2}$$

$$P''_{j}(E \mid \text{shr, addr fail, err } A_1, 1 < j < k) = (1 - p_f)^{2}(1 - 0.5p_a - p_d)^{2k-j-3}$$

$$P''_{k}(E \mid \text{shr, addr fail, err } A_1) = (1 - p_f)(1 - 0.5p_a - p_d)^{k-2}$$
Now if a column k switch is failed in the address mode the
probability that it produces errors depends on the number
of paths using the switch. If j<k then two paths use the
switch, one from the pair of A2 paths and the A1 path. Thus

$$P(\text{no error} \mid \text{shared addr failed switch}) = \begin{cases} 1/4, & j < k \\ 1/8, & j = k \end{cases}$$

$$P(\text{err } A_1, \text{ no error } A_2 \text{ paths} \mid \text{shared addr failed switch}) = \begin{cases} 1/4, & j < k \\ 1/8, & j = k \end{cases}$$
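This completes the evaluation of this term. The
other three terms that involve three active inputs can be
derived from this one as follows:

$$P'(A_{12}^{U} \cap \bar{A}_{2}^{U} \cap A_{12}^{L} \cap A_{22}^{L}) = P''(A_{12}^{U} \cap A_{12}^{L} \cap A_{22}^{L})\,m_0^{3}(1 - m_0)$$

and

$$\begin{aligned}
P''(A_{12}^{U} \cap A_{12}^{L} \cap A_{22}^{L}) &= 1 - P''(\overline{A_{12}^{U} \cap A_{12}^{L} \cap A_{22}^{L}}) \\
&= 1 - P''(A_{11}^{U} \cup A_{11}^{L} \cup A_{21}^{L}) \\
&= 1 - P''(A_{11}^{U}) - P''(A_{11}^{L}) - P''(A_{21}^{L}) \\
&\quad + P''(A_{11}^{U} \cap A_{11}^{L}) + P''(A_{11}^{U} \cap A_{21}^{L}) + P''(A_{11}^{L} \cap A_{21}^{L}) \\
&\quad - P''(A_{11}^{U} \cap A_{11}^{L} \cap A_{21}^{L})
\end{aligned}$$

Now

$$P''(A_{11}^{U} \cap A_{11}^{L} \cap A_{21}^{L}) = P''(A_{22}^{U} \cap A_{12}^{L} \cap A_{22}^{L})$$

by symmetry and all the other terms in the equation have
been previously evaluated. Finally by symmetry

$$P'(A_{12}^{U} \cap A_{22}^{U} \cap \bar{A}_{1}^{L} \cap A_{22}^{L}) = P'(\bar{A}_{1}^{U} \cap A_{22}^{U} \cap A_{12}^{L} \cap A_{22}^{L})$$

and

$$P'(A_{12}^{U} \cap A_{22}^{U} \cap A_{12}^{L} \cap \bar{A}_{2}^{L}) = P'(A_{12}^{U} \cap \bar{A}_{2}^{U} \cap A_{12}^{L} \cap A_{22}^{L})$$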

This completes the evaluation of the terms with three
active inputs. Now the term which has all four inputs
active must be evaluated. This term is

$$P'(A_{12}^{U} \cap A_{22}^{U} \cap A_{12}^{L} \cap A_{22}^{L}) = P''(A_{12}^{U} \cap A_{22}^{U} \cap A_{12}^{L} \cap A_{22}^{L})\,m_0^{4}$$

Let

$$E = (A_{12}^{U} \cap A_{22}^{U} \cap A_{12}^{L} \cap A_{22}^{L})$$

To evaluate this term, first condition on the number
of switches shared by the two paths which are primary for
the A1 pair of paths and the number of shared switches in
the A2 pair of paths. Then condition on the number of
column k switches which are shared between the two sets of paths.
Note that for this event to occur it must be that there are
no errors in either of the A2 primary paths and at least
one error in each of the A1 primary paths. Thus

$$P''(E) = \sum_{i}\sum_{j}\sum_{l} P''(E \mid A_1 \text{ paths shr } i,\ A_2 \text{ paths shr } j,\ \text{shr } l)\,P(\text{shr } l \mid i, j)\,P(\text{shr } i, j)$$

Now the number of shared switches in each set of
paths is determined by the memories requested. These are
independent by assumption and therefore the probability of
sharing i switches in the A1 paths is independent of the
probability of sharing j switches in the A2 paths. Thus
we have

$$P(\text{share } i, j) = \begin{cases}
(1/2)^{i+j}, & i, j < k \\
(2/N)(1/2)^{i}, & i < k,\ j = k \\
(2/N)(1/2)^{j}, & j < k,\ i = k \\
4/N^{2}, & i = j = k
\end{cases}$$

The probability of sharing l column k switches is
more difficult to determine. Let Xm represent the binary
digits of the memory addressed by one of the A1 primary
paths and let Ym represent the digits of the other.
Similarly let Wm and Zm represent the digits of the
memories addressed by the A2 paths. Now notice that the
switch used in column k is determined by the first k-1
digits of the address of the memory accessed. Also notice
that, since the A1 paths share exactly i switches, the
first i-1 digits of their memory requests are identical and
the ith digits are complements of each other. Thus the
addresses can be expressed as

$$\begin{aligned}
&X_1 X_2 \cdots X_{i-1}\, X_i\, X_{i+1} \cdots X_{k-1} X_k \\
&X_1 X_2 \cdots X_{i-1}\, \bar{X}_i\, Y_{i+1} \cdots Y_{k-1} Y_k \\
&W_1 W_2 \cdots W_{j-1}\, W_j\, W_{j+1} \cdots W_{k-1} W_k \\
&W_1 W_2 \cdots W_{j-1}\, \bar{W}_j\, Z_{j+1} \cdots Z_{k-1} Z_k
\end{aligned}$$

Now, since the memory requests are all independent,
the probability of sharing l column k switches can be
determined by determining the probability that l memory IDs
in the first pair match l IDs in the second pair. The
result is as follows:

$$\begin{array}{ll}
i = j = k: & P(\text{share } 0) = 1 - 0.5^{k-1} \\
 & P(\text{share } 1) = 0.5^{k-1} \\
 & P(\text{share } 2) = 0 \\[4pt]
i = k,\ j < k: & P(\text{share } 0) = 1 - 0.5^{k-1} \\
 & P(\text{share } 1) = 0.5^{k-1} \\
 & P(\text{share } 2) = 0 \\[4pt]
i = j < k: & P(\text{share } 0) = 1 - 0.5^{k-3} + 0.5^{2k-4-i} - 0.5^{i-1}\,0.25^{k-1-i} \\
 & P(\text{share } 1) = 0.5^{k-3} - 0.5^{2k-4-i} \\
 & P(\text{share } 2) = 0.5^{i-1}\,0.25^{k-1-i} \\[4pt]
i < j \le k \text{ or } j < i \le k: & P(\text{share } 0) = 1 - 0.5^{k-1} \\
 & P(\text{share } 1) = 0.5^{k-1} \\
 & P(\text{share } 2) = 0
\end{array}$$
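
The case i = j < k is the least obvious of these results, and it can be checked by exhaustive enumeration. The sketch below is an independent check under stated assumptions rather than the author's derivation: the function name is hypothetical, digits are taken most significant first, the column k switch is identified with the first k-1 digits of the memory ID, and each pair of requests is taken to be uniform over all prefixes that agree in the first i-1 digits and differ in digit i. The expressions it is compared against are P(share 2) = 0.5^(i-1) 0.25^(k-1-i) and P(share 1) = 0.5^(k-3) - 0.5^(2k-4-i).

```python
from fractions import Fraction
from itertools import product


def column_k_share_distribution(k, i):
    """Exact distribution of the number of column k switches shared by two
    pairs of paths when each pair shares exactly i < k switches."""
    bits = k - 1                          # digits that select the column k switch
    split = bits - i                      # bit position of digit i (MSB first)
    low_mask = (1 << split) - 1           # digits i+1 .. k-1 are free

    pairs = []
    for first in range(2 ** bits):
        flipped = first ^ (1 << split)    # same digits 1..i-1, digit i complemented
        for low in range(low_mask + 1):
            second = (flipped & ~low_mask) | low
            pairs.append((first, second))

    dist = {0: Fraction(0), 1: Fraction(0), 2: Fraction(0)}
    weight = Fraction(1, len(pairs) ** 2)
    for (x, y), (w, z) in product(pairs, pairs):
        dist[len({x, y} & {w, z})] += weight
    return dist
```

For k = 4 and i = 2 the enumeration gives 5/8, 1/4 and 1/8 for sharing 0, 1 and 2 column k switches, which agrees with the closed-form expressions quoted in the lead-in.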

Now define the following:

$$P''_{ijl}(E) = P''(E \mid A_1 \text{ paths shr } i,\ A_2 \text{ paths shr } j,\ \text{sets share } l \text{ in column } k)$$

The objective is to evaluate this probability for all
possible values of i, j and l.

Consider the case of l = 0. In this case the two path
sets are disjoint and therefore independent. As a result

\[
P'^{i,j,0}(E) = P'^{i}(A_1^1 \cap A_1^2)\, P'^{j}(\bar{A}_2^1 \cap \bar{A}_2^2)
\]

P‘  ,J<A?2nA$a)  -  (i-o.5p.-p„) 21,-1  j.i 

{l-pf)1<l-0.5p,-pll)Ik_j''1  l<jck 

(l-pf)(l-0.5p,-pa)fc"*(l-0.75p.-Rl)  j=k 


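\[
\begin{aligned}
P'^{i}(A_1^1 \cap A_1^2) &= 1 - P'^{i}(\bar{A}_1^1 \cup \bar{A}_1^2) \\
 &= 1 - P'^{i}(\bar{A}_1^1) - P'^{i}(\bar{A}_1^2) + P'^{i}(\bar{A}_1^1 \cap \bar{A}_1^2) \\
 &= 1 - 2(1 - 0.5p_a - p_d)^{k} + P'^{i}(\bar{A}_1^1 \cap \bar{A}_1^2)
\end{aligned}
\]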

Now consider the case of l = 1. Here as before,
condition on whether the shared switch is functional or
failed and then, given that it has failed, condition on the
failure mode and then, for address mode failures, on
whether the failure produces an error for the $A_1$ path or
paths that go through the failed switch. Note that the
probability of the event is 0 if the failure causes an
error for the $A_2$ paths that pass through it. Thus

\[
\begin{aligned}
P'^{i,j,1}(E) ={}& P'^{i,j,1}(E \mid \text{shared last column switch not failed})\,(1 - p_f) \\
 &+ P'^{i,j,1}(E \mid \text{shared last column switch addr failed})\, p_a \\
 &+ P'^{i,j,1}(E \mid \text{shared last column switch data failed})\, p_d
\end{aligned}
\]

Now if the shared switch has not failed it cannot
cause errors for either of the sets of paths. If this is
true the probability of errors in one set of paths, exclusive
of that switch, is independent of the same probability
in the other set. Thus

\[
P'^{i,j,1}(E \mid \text{shr ok}) = P'^{i}(A_1^1 \cap A_1^2 \mid \text{shr ok})\, P'^{j}(\bar{A}_2^1 \cap \bar{A}_2^2 \mid \text{shr ok})
\]

The  probabilities  in  the  above  equation  are  easily  eval¬ 
uated  and  are  given  by 

P1  ,J<AaanA£alshr  ok)  .  <l-0.5p,-p„)2k~2  J-l 

<l-pf)2{l-0  •5p.-pd)2k_j~1  l<j<k 

<l-pfXl-0.5p,-Pll)k-2  j=k 

and 

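\[
\begin{aligned}
P'^{i}(A_1^1 \cap A_1^2 \mid \text{shr ok}) &= 1 - P'^{i}(\bar{A}_1^1 \cup \bar{A}_1^2 \mid \text{shr ok}) \\
 &= 1 - P'^{i}(\bar{A}_1^1 \mid \text{ok}) - P'^{i}(\bar{A}_1^2) + P'^{i}(\bar{A}_1^1 \cap \bar{A}_1^2 \mid \text{shr ok}) \\
 &= 1 - 2(1 - 0.5p_a - p_d)^{k-1} + P'^{i}(\bar{A}_1^1 \cap \bar{A}_1^2 \mid \text{shr ok}) && i = k \\
 &= 1 - (1 - 0.5p_a - p_d)^{k-1} - (1 - 0.5p_a - p_d)^{k} + P'^{i}(\bar{A}_1^1 \cap \bar{A}_1^2 \mid \text{shr ok}) && i < k
\end{aligned}
\]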

Now if the shared switch is failed in the address
mode

\[
\begin{aligned}
P'^{i,j,1}(E \mid \text{shr addr fail}) ={}& P'^{i,j,1}(E \mid \text{shr addr fail, error})\, P(\text{error} \mid \text{shr addr fail}) \\
 &+ P'^{i,j,1}(E \mid \text{shr addr fail, no err})\, P(\text{no err} \mid \text{shr addr fail})
\end{aligned}
\]

If the address mode failed switch does not produce errors
for either of the paths that pass through it, the situation
is identical to the case when the shared switch is functional
and

\[
P'^{i,j,1}(E \mid \text{shr addr fail, no error}) = P'^{i,j,1}(E \mid \text{shr no fail})
\]


Now the probability that no errors are produced depends
on the number of paths passing through the switch and
thus on i and j. Evaluating gives

\[
P'^{i,j,1}(\text{no error} \mid \text{shr addr fail}) =
\begin{cases}
1/4 & i, j < k \\
1/8 & i < k,\ j = k \ \text{or}\ j < k,\ i = k \\
1/16 & i = j = k
\end{cases}
\]
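
The three values follow a simple counting pattern, sketched below under the assumption that each path through the stuck switch independently happens to need the stuck setting with probability 1/2. The function names are illustrative and do not come from the original text.

```python
def paths_through_shared_switch(i, j, k):
    """Count the paths that use the shared column-k switch.

    A path set contributes both of its paths when it shares all k
    switches, and only one path otherwise.
    """
    a1_paths = 2 if i == k else 1
    a2_paths = 2 if j == k else 1
    return a1_paths + a2_paths


def p_no_error_given_addr_fail(i, j, k):
    """Probability that the stuck setting suits every path through the switch."""
    return 0.5 ** paths_through_shared_switch(i, j, k)


for case in [(2, 3, 5), (2, 5, 5), (5, 3, 5), (5, 5, 5)]:
    print(case, p_no_error_given_addr_fail(*case))
```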


Now if the shared switch produces an error for the
path or paths passing through it, the probability of errors
in the other $A_1$ path and of no errors in the $A_2$ paths must
be determined. Once again this depends on the number of
paths passing through the switch. If both $A_1$ paths pass
through the address mode failed switch, the possibility of
an error in only one of the $A_1$ paths or in both of the $A_1$
paths must be determined. Evaluating
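\[
\begin{aligned}
&P'^{i,j,1}(E \mid \text{shared addr fail, error})\, P'^{i,j,1}(\text{error} \mid \text{shared addr fail}) \\
&\quad = [1 - (1 - 0.5p_a - p_d)^{k}]\, P'^{j}(\bar{A}_2^1 \cap \bar{A}_2^2 \mid \text{shr ok})\,(1/4) && i, j < k \\
&\quad = [1 - (1 - 0.5p_a - p_d)^{k}]\, P'^{j}(\bar{A}_2^1 \cap \bar{A}_2^2 \mid \text{shr ok})\,(1/8) && i < j = k \\
&\quad = [1 - (1 - 0.5p_a - p_d)^{k-1}]\, P'^{j}(\bar{A}_2^1 \cap \bar{A}_2^2 \mid \text{shr ok})\,(1/4) \\
&\qquad + P'^{j}(\bar{A}_2^1 \cap \bar{A}_2^2 \mid \text{shr ok})\,(1/8) && j < i = k \\
&\quad = [1 - (1 - 0.5p_a - p_d)^{k-1}]\, P'^{j}(\bar{A}_2^1 \cap \bar{A}_2^2 \mid \text{shr ok})\,(1/8) \\
&\qquad + P'^{j}(\bar{A}_2^1 \cap \bar{A}_2^2 \mid \text{shr ok})\,(1/16) && i = j = k
\end{aligned}
\]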


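This completes the evaluation for the case of l = 1. Now
evaluate for l = 2.

When l = 2 condition on the number of failed switches,
thus

\[
\begin{aligned}
P'^{i,j,2}(E) ={}& P'^{i,j,2}(E \mid \text{no shared failed})\, P'^{i,j,2}(\text{no shared failed}) \\
 &+ P'^{i,j,2}(E \mid 1 \text{ shared addr fail})\, P'^{i,j,2}(1 \text{ shared addr failed}) \\
 &+ P'^{i,j,2}(E \mid 2 \text{ shared addr fail})\, P'^{i,j,2}(2 \text{ shared addr failed}) \\
 &+ P'^{i,j,2}(E \mid 1 \text{ or } 2 \text{ shared data fail})\, P'^{i,j,2}(1 \text{ or } 2 \text{ shared data failed})
\end{aligned}
\]

and

\[
\begin{aligned}
P'^{i,j,2}(\text{no shared failed}) &= (1 - p_f)^2 \\
P'^{i,j,2}(1 \text{ shared addr failed}) &= 2 p_a (1 - p_f) \\
P'^{i,j,2}(2 \text{ shared addr failed}) &= p_a^2
\end{aligned}
\]
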
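
The conditioning events for two shared switches can be checked numerically. The sketch below assumes that the total failure probability is p_f = p_a + p_d and that the remaining event (at least one data mode failure) has probability 1 - (1 - p_d)^2; neither expression is written out at this point in the text, so both are labeled assumptions. The function name is hypothetical.

```python
def shared_switch_failure_probs(p_a, p_d):
    """Probabilities of the failure patterns of the two shared column-k switches."""
    p_f = p_a + p_d  # assumption: address and data mode failures are disjoint
    return {
        "no shared failed": (1 - p_f) ** 2,
        "1 shared addr failed": 2 * p_a * (1 - p_f),
        "2 shared addr failed": p_a ** 2,
        # assumption: everything else involves at least one data mode failure
        "1 or 2 shared data failed": 1 - (1 - p_d) ** 2,
    }


probs = shared_switch_failure_probs(p_a=0.05, p_d=0.02)
total = sum(probs.values())
print(total)
```

The first three terms sum to (1 - p_d)^2, so the four events partition the sample space under these assumptions.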
Once again, if there are no failures in the shared
switches, the events that errors occur in path sets are
independent and

\[
P'^{i,j,2}(E \mid \text{no fail}) = P'^{i}(A_1^1 \cap A_1^2 \mid 2 \text{ ok})\, P'^{j}(\bar{A}_2^1 \cap \bar{A}_2^2 \mid 2 \text{ ok})
\]

and

P'  ,JU”jr>A5jl  2  ok)  =  {l-0.$p,-pd)2k_J  j=l 

=  <l-pf),(L-0.5p.-pd)2k-J‘t  j>l 

and 

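\[
\begin{aligned}
P'^{i}(A_1^1 \cap A_1^2 \mid 2 \text{ ok}) &= 1 - P'^{i}(\bar{A}_1^1 \cup \bar{A}_1^2 \mid 2 \text{ ok}) \\
 &= 1 - 2P'^{i}(\bar{A}_1^1 \mid \text{ok}) + P'^{i}(\bar{A}_1^1 \cap \bar{A}_1^2 \mid 2 \text{ ok}) \\
 &= 1 - 2(1 - 0.5p_a - p_d)^{k-1} + P'^{i}(\bar{A}_1^1 \cap \bar{A}_1^2 \mid 2 \text{ ok})
\end{aligned}
\]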

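Now if one shared switch is failed in the address
mode that switch may or may not produce an error in the $A_1$
path that goes through it. Thus

\[
\begin{aligned}
P'^{i,j,2}(E \mid 1 \text{ addr fail}) ={}& P'^{i,j,2}(E \mid 1 \text{ fail, error})\, P'^{i,j,2}(\text{error} \mid 1 \text{ addr fail}) \\
 &+ P'^{i,j,2}(E \mid 1 \text{ fail, no err})\, P'^{i,j,2}(\text{no err} \mid 1 \text{ addr fail}) \\
 ={}& [1 - (1 - 0.5p_a - p_d)^{k-1}]\, P'^{j}(\bar{A}_2^1 \cap \bar{A}_2^2 \mid 2 \text{ ok})\,(1/4) \\
 &+ P'^{i,j,2}(E \mid 2 \text{ ok})\,(1/4)
\end{aligned}
\]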

Next if both of the shared switches fail in the address
mode, condition on the number of $A_1$ path errors produced
giving

\[
\begin{aligned}
P'^{i,j,2}(E \mid 2 \text{ addr fail}) ={}& P'^{i,j,2}(E \mid 2 \text{ addr fail, } 0 \text{ error})\, P'^{i,j,2}(0 \text{ error} \mid 2 \text{ addr fail}) \\
 &+ P'^{i,j,2}(E \mid 2 \text{ addr fail, } 1 \text{ error})\, P'^{i,j,2}(1 \text{ error} \mid 2 \text{ addr fail}) \\
 &+ P'^{i,j,2}(E \mid 2 \text{ addr fail, } 2 \text{ error})\, P'^{i,j,2}(2 \text{ error} \mid 2 \text{ addr fail})
\end{aligned}
\]

The probabilities of errors given two address mode
failures are

\[
\begin{aligned}
P'^{i,j,2}(\text{no errors} \mid 2 \text{ addr fail}) &= 1/16 \\
P'^{i,j,2}(1 \text{ error } A_1 \text{ paths, no err } A_2 \text{ paths} \mid 2 \text{ addr fail}) &= 1/8 \\
P'^{i,j,2}(2 \text{ errors } A_1 \text{ paths, no err } A_2 \text{ paths} \mid 2 \text{ addr fail}) &= 1/16
\end{aligned}
\]

If there are no errors, the probability is equal to the
probability given that there are no failures.

\[
P'^{i,j,2}(E \mid 2 \text{ addr fail, no error}) = P'^{i,j,2}(E \mid \text{no fail})
\]

With one error, the probability that an error exists in the
remainder of the $A_1$ path for which there was no error in
the shared switches must be accounted for. Thus

\[
P'^{i,j,2}(E \mid 2 \text{ addr fail, } 1 \text{ error}) = [1 - (1 - 0.5p_a - p_d)^{k-1}]\, P'^{j}(\bar{A}_2^1 \cap \bar{A}_2^2 \mid 2 \text{ ok})
\]

If there are errors in both of the $A_1$ paths then only the
probability that there are no errors in the remainder of
the $A_2$ paths is important and

\[
P'^{i,j,2}(E \mid 2 \text{ addr fail, } 2 \text{ error}) = P'^{j}(\bar{A}_2^1 \cap \bar{A}_2^2 \mid 2 \text{ ok})
\]

Finally,  if  the  shared  switch  fails  in  the  data  mode  the 
probability  of  the  event  is  0. 

This  completes  the  evaluation  of  all  terms  necessary 
to  evaluate  the  probability  that  the  two  lines  input  to  a 
column  one  switch  of  the  S/E+  network  are  jointly  active. 

As stated previously, if indicator random variables
for each of the lines in question are defined such that
these random variables are 1 when the line is active and 0
when inactive, the covariance of these random variables is
given by

\[
\text{COVAR} = P(B_{10} \cap B'_{10}) - P^2(B_{10})
\]

where $B_{10}$ and $B'_{10}$ denote the events that the two lines are active.

Figure 15 shows plots of the covariance vs $p_a$ for
various values of k and $p_d$. All of the data were calculated
with $m_0$ equal to one. The maximum value of the covariance
is 0.05 and occurs in the region near values of 0.02
for $p_d$, 0.00 for $p_a$ and 16 for N. The value of covariance
decreases as the number of processors and memories increases.
Thus it can be concluded that assuming the two lines are
independent for purposes of calculating the expected
bandwidth will induce only small errors in the results.

Once the assumption that the inputs to any stage one
switch are independent is made, the argument of disjoint
sets of independent events can be used to show that the
inputs to all switches in stages 1 to k-1 are independent.
This being so, the analysis done for the S/E network
applies to these stages.

Figure 15b - Covariance as a function of $p_a$ and $p_d$ (N=64)

Figure 15c - Covariance as a function of $p_a$ and $p_d$ (N=256)

Figure 15d - Covariance as a function of $p_a$ and $p_d$ (N=1024)

The last stage is not so simple. Figure 16 shows a
case where the inputs to a final stage switch are not
independent. To compute the probability that a final stage
output line is active, condition on whether its input lines
are independent, thus

\[
\begin{aligned}
P(B_{1k}) ={}& P(B_{1k} \mid \text{independent inputs to stage } k)\, P(\text{independent}) \\
 &+ P(B_{1k} \mid \text{dependent inputs to stage } k)\, P(\text{dependent})
\end{aligned}
\]

Figure 16 - Dependent Inputs to Stage K

The problem becomes how to calculate the probability
that the inputs are independent. For the inputs to come
from dependent sources, some pair of processors which are
the inputs for the same column 0 switch must have requested
the two memories which are connected to this switch. This
probability is 2/N. Then there must be no errors in the
paths prior to the last switch. If there were errors in
either path the requests would either be blocked when the
error was detected or there would be a conflict at the
first stage resulting in one of the requests being blocked.
Finally neither of the requests could have been blocked by
other processor requests prior to the last stage. Clearly,
for large values of N this probability must be small and
the errors introduced by disregarding the effects of this
dependency should be small.

As  a  result,  the  bandwidth  of  the  S/E+  network  can  be 
approximated  by  calculating  the  probability  that  an  output 
line  from  stage  1  is  active  using  the  derived  equation  and 
then  using  the  results  for  an  S/E  network  to  represent  the 
remaining  stages. 

This provides a simple and useful model for estimating
the failure dependent bandwidth of the S/E+ network.
Further it demonstrates that the bandwidth of the S/E+
network is strictly less than that for an S/E network of
the same size, when the stage 0 control strategy is based
only on error detection. The S/E+ network increases the
probability that a randomly selected processor to memory
connection can be made in the presence of faults at the
cost of communication bandwidth. The system designer must
determine whether this is a desirable trade. Further this
model provides a simple means to evaluate the cost benefit
ratio for reliability measures implemented at the switch
level.

Figure 17 presents some of the results of this analysis.
It displays the bandwidth vs probability of address
mode failure for several values of N and $p_d$ for the S/E+
network. All of the calculations used a value of one for
$m_0$.

Comparison  of  Results 

Figures 15 and 17 present the calculated bandwidth
for both the S/E and S/E+ networks. The bandwidth of the
S/E+ network is strictly less than that for the S/E network.
Figure 18 displays the percentage loss in bandwidth
suffered by the S/E+ network as compared to the base S/E
network. The bandwidth loss is small for small values of
$p_d$. As $p_d$ increases, however, the percentage loss becomes a
substantial part of the available bandwidth. Thus, the
extra column in the S/E+ network serves to increase the
probability that any given connection can be made at the
cost of a decrease in bandwidth that is small when $p_d$ is
small.

Figure 17a - Normalized Bandwidth SE+ ($p_d$=0)

Figure 17b - Normalized Bandwidth SE+ ($p_d$=0.1)

Figure 18c - Bandwidth Reduction SE - SE+ ($p_d$=0.2)

Figure 18d - Bandwidth Reduction SE - SE+ ($p_d$=0.5)


IV.  VERIFICATION  OF  S/E+  MODEL 


In chapter 3 a bandwidth model for the S/E+ network
was developed. In the development of this model two important
assumptions were made. These were:

1.) The small value of covariance allowed us to
    assume that the events that outputs from the
    same column 0 switch are active are independent.

2.) The probability that two active inputs to a
    final stage switch originate at the same column
    0 switch is small and can be ignored.

In order to test the validity of the assumptions a
simulation of the S/E+ network was developed. The simulation
is based on the following set of equations:

\[
E(BW) = \sum_{\text{all fail states}} E(BW \mid \text{fail state})\, P(\text{fail state})
\]

\[
E(BW \mid \text{fail state}) = \sum_{\text{all inputs}} E(BW \mid \text{fail state, input})\, P(\text{input})
\]

\[
E(BW \mid \text{fail state, input}) = \sum_{\text{all conflict states}} E(BW \mid \text{fail state, input, conflict state})\, P(\text{conflict state})
\]

In the first of these equations the expected bandwidth
is obtained by conditioning on the failure state of the ICN
and then summing over all possible failure states. Here a
failure state is defined as one particular failure configuration
from the set of all possible ICN switch failure
sets. This set includes the case of no failures. Next, in
order to calculate the expected bandwidth given a particular
failure state, condition on the input state (the set
of applied input requests) and sum over all possible
input request sets. Finally, given the input and failure
states, condition on the resolution of conflicts at the ICN
switches and sum over all possible resolutions.

These equations represent a complete calculation of
the expected bandwidth. Unfortunately they cannot be completely
evaluated in a reasonable amount of time on currently
available computer systems. In order to demonstrate
this consider an 8x8 S/E+ ICN. The ICN is composed of 16
switches each of which can assume one of four states
(operational, stuck at X, stuck at T or data mode failure).
This results in $4^{16}$ or $2^{32}$ possible failure states. For
each of these, even if the evaluation is restricted to the
set of inputs for which each processor is active ($m_0$ is 1),
$8^8$ or $2^{24}$ possible input states must be evaluated. For
each of these input states every possible conflict state
must be evaluated. Given an input state consisting of 8
requests this will require the evaluation of from 1 to as
many as $2^7$ conflict states. Thus, a complete calculation
for an 8x8 S/E+ ICN would require that a minimum of $2^{56}$
total network states be evaluated. This is clearly not
feasible even on the fastest of modern computers.
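
The size of the exhaustive calculation can be reproduced with a few lines of integer arithmetic. This is only a counting sketch using the figures quoted in the paragraph; the variable names are illustrative.

```python
# State counts for an exhaustive evaluation of an 8x8 S/E+ ICN.
SWITCHES = 16          # number of switches quoted in the text
STATES_PER_SWITCH = 4  # operational, stuck at X, stuck at T, data mode failure
PROCESSORS = 8
MEMORIES = 8

failure_states = STATES_PER_SWITCH ** SWITCHES  # 4^16
input_states = MEMORIES ** PROCESSORS           # 8^8, every processor active
max_conflict_states = 2 ** 7                    # upper figure quoted in the text

# Each (failure state, input state) pair has at least one conflict state.
min_network_states = failure_states * input_states
max_network_states = min_network_states * max_conflict_states

for name, value in [("failure states", failure_states),
                    ("input states", input_states),
                    ("minimum network states", min_network_states),
                    ("maximum network states", max_network_states)]:
    print(f"{name}: 2^{value.bit_length() - 1}")
```

Restricting the failure model to two states per switch, as is done for verification, reduces the first factor from 4^16 to 2^16.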

The number of states that must be evaluated can be
reduced as follows: First observe that the maximum value
of the covariance occurs when $p_a$ is 0. Using this, for
verification purposes, the calculation can be restricted to
the case where $p_a$ is 0. This results in $2^{16}$ failure
states. Next, note that the covariance and the probability
that two inputs to a final column switch originate at the
same column 0 switch both decrease as N, the number of
processors and memories attached to the ICN, increases.
Thus, the maximum error induced by the assumptions in the
model for S/E+ bandwidth will occur for small values of N.
Since a 4x4 S/E+ ICN is trivially equivalent to a 4x4
crossbar, the simulation was performed for an 8x8 ICN.
Finally, expected bandwidth can be approximated, without
inducing large errors, by calculating it for a large number
(but fewer than $8^8$) of random input states.

Restricting the failure modes to the data only mode
allows the expected bandwidth to be expressed as a 16th
order binomial equation in $p_d$ where, letting f represent
the number of failures, the coefficients can be calculated
from the equation:

\[
c_f = \sum_{\text{all fail states with } f \text{ fails}} E(BW \mid \text{fail state})
\]

and the expected bandwidth is given by:

\[
E(BW) = \sum_{f=0}^{16} c_f\, p_d^{f} (1 - p_d)^{16-f}
\]

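
Evaluating the binomial expression is straightforward once the coefficients are known. The sketch below assumes the sum runs from f = 0 (the no-failure state) to 16; the coefficient values used in the example are made up purely to exercise the function and are not results from the simulation.

```python
from math import comb


def expected_bandwidth(coeffs, p_d):
    """Evaluate sum_f c_f * p_d^f * (1 - p_d)^(n - f), with n = len(coeffs) - 1."""
    n = len(coeffs) - 1
    return sum(c * p_d ** f * (1 - p_d) ** (n - f)
               for f, c in enumerate(coeffs))


# Hypothetical check: if every failure state had the same conditional
# bandwidth b, then c_f = C(16, f) * b and the sum collapses to b.
b = 3.0
coeffs = [comb(16, f) * b for f in range(17)]
value = expected_bandwidth(coeffs, p_d=0.17)
print(value)
```

The constant-bandwidth case is only a consistency check of the weighting; real coefficients would fall off as f grows.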

Again this represents a full calculation of the expected
bandwidth given that all failures are data mode type
failures. Complete evaluation of the coefficients ($c_f$) in this
equation is not computationally feasible. An approximation
can, however, be made by restricting the number of input
requests used in the calculation. Let n represent the
number of random input states used to calculate each
expected bandwidth given a failure state used in the above
equations. Then by the central limit theorem

\[
E(BW \mid \text{fail state}_i) \sim \text{Normal}(\mu_i,\ \sigma_i^2/n)
\]

where $\mu_i$ is the actual expected bandwidth given a
particular failure state and $\sigma_i$ is the deviation of the
sample distribution. Given that this is true the
coefficients in the binomial equation are distributed as
follows:

\[
c_f \sim \text{Normal}\Big(\sum_{i=1}^{\binom{16}{f}} \mu_{fi},\ \sum_{i=1}^{\binom{16}{f}} \sigma_{fi}^2/n\Big)
\]

and the calculated expected bandwidth is distributed as

\[
E(BW)_c \sim \text{Normal}(\mu',\ \sigma'^2)
\]

where

\[
\mu' = \sum_{f=0}^{16} p_d^{f} (1 - p_d)^{16-f} \sum_{i=1}^{\binom{16}{f}} \mu_{fi}
\]

and

\[
\sigma'^2 = \sum_{f=0}^{16} p_d^{2f} (1 - p_d)^{2(16-f)} \sum_{i=1}^{\binom{16}{f}} \sigma_{fi}^2/n
\]

The $\mu'$ above is the actual expected bandwidth and $\sigma'$ is
the deviation of the bandwidth calculated using the
simulation. In order to determine a bound on the deviation
define $\sigma_{max}$ such that it is greater than $\sigma_{fi}$ for all values
of f and i. Then

\[
\sigma'^2 \le \frac{\sigma_{max}^2}{n} \sum_{f=0}^{16} \binom{16}{f} p_d^{2f} (1 - p_d)^{2(16-f)}
\]

Using a two point equally probable 0, 8 distribution, it is
easily shown that

\[
\sigma_{max}^2 \le 16
\]

This gives a .99 confidence interval of ± .09 for n of
20,000 which was used in the simulation.
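
The bound on the deviation can be evaluated directly. The sketch below assumes a variance bound of 16 (the variance of an equally probable 0/8 two-point distribution) and uses the identity that the binomial-weighted sum equals (p_d^2 + (1 - p_d)^2)^16. The multiplier behind the quoted .99 interval is not stated in the text, so the three-standard-deviation figure shown here is only an illustrative comparison.

```python
from math import sqrt


def sigma_bound(p_d, n, sigma_max_sq=16.0, switches=16):
    """Upper bound on the deviation of the simulated expected bandwidth."""
    # sum_f C(s, f) p^(2f) (1-p)^(2(s-f)) = (p^2 + (1-p)^2)^s
    weight = (p_d ** 2 + (1 - p_d) ** 2) ** switches
    return sqrt(sigma_max_sq / n * weight)


worst = sigma_bound(p_d=0.0, n=20_000)  # the weight is largest at p_d = 0
print(round(worst, 4), round(3 * worst, 4))
```

Because the weight never exceeds one, the p_d = 0 value bounds the deviation for every p_d.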

A  comparison  of  the  expected  bandwidths  calculated 
using  the  model  and  the  simulation  is  presented  in  figure 
19.  Figure  20  is  a  plot  of  the  absolute  difference  between 
the  model  and  the  simulation.  Figure  21  is  a  plot  of  the 
covariance  for  pa  equal  to  zero.  The  maximum  value  of  the 
error  occurs  when  pd  is  0.17  and  represents  approximately  a 
15%  error. 

Figure 22 shows the model error as a percentage of the
simulation bandwidth. The percentage error continues to
increase after the absolute error begins to decrease but
remains below 15% for values of $p_d$ below 0.20, which is the
area of primary interest. Further it is expected that the
absolute and percentage errors will be at their maximum at the
conditions presented, that is for $p_a$ zero and for a low
value of N, the number of processors and memories connected
to the ICN.

Thus the model developed provides a reasonable approximation
to the expected bandwidth even in the worst case
conditions for the model. The percentage error in the
expected bandwidth calculated using the model should decrease
as N increases and as $p_a$ increases. It is not,
however, computationally feasible with currently available
computational resources to simulate the ICN for either a
larger value of N or for various values of $p_a$.


Figure 19 - Expected Bandwidth Model vs Simulation


V.  EVALUATION OF RELIABILITY MEASURES


AN  EXAMPLE 


In this chapter, an example is presented to demonstrate
the use of the model developed in chapters two and three
to evaluate various reliability enhancement measures as
applied to the S/E and S/E+ ICNs. Before
this can be done, however, two tasks must be accomplished.
First, as was observed in chapter three, the expected
bandwidth of the S/E+ ICN is strictly less than that of the
S/E ICN. If these two networks are to be compared
directly, a connectivity measure must be developed. Such a
measure should favor the S/E+ ICN. When combined with the
bandwidth analysis, it should allow comparison of the S/E
and S/E+ ICNs. Next, $p_a$ and $p_d$ for the switches used in
the ICN must be available. For the purposes of this
example, a hypothetical switch model which can be used to
calculate $p_a$ and $p_d$ will be developed.

Connectivity  Equations 

Cherkassky [7] has developed equations which can be
used to calculate the probability that any pair of terminals
can be connected through a tree structured ICN in
which data mode faults occur. These equations are not,
however, sufficient for this example as they are valid only
in the asymptotic sense and they do not extend to the
address mode stuck fault model. Thus, complete equations
using the more complete fault model must be developed.

This is quite simple for the S/E ICN as this ICN contains
only a single path from any input to any output. If a
connection is to be made, the path between the desired
input and output must be fault free or must have only stuck
at address mode failures which allow the desired connection.
Let P(C) represent the probability that a
randomly selected pair of terminals can be connected using
the desired ICN. Then

\[
P(C)_{S/E} = (1 - 0.5p_a - p_d)^{k}
\]

In the S/E+ ICN, there are two paths from any input
to any output terminal. The desired connection can be made
if either of these paths is capable of making the connection.
These paths are not, however, independent as they
pass through the same first and last stage switches. Thus,
in order to calculate $P(C)_{S/E+}$ condition on the failure
state of these switches. Thus

\[
P(C)_{S/E+} = P(C \mid \text{stage 0 ok})\,(1 - p_f) + P(C \mid \text{stage 0 address fail})\, p_a
\]

Now

\[
P(C \mid \text{stage 0 ok}) = P'(C \mid \text{stage } k \text{ ok})\,(1 - p_f) + P'(C \mid \text{stage } k \text{ address fail})\, p_a
\]

where again P'(C) is used to indicate the conditional
probability given that the stage 0 switch in question is
functional. If both the stage 0 and stage k switches are
functional, the event that a connection can be made occurs
if either of the paths, exclusive of the first and last
stages, is functional for the desired connection. Thus

\[
P'_{S/E+}(C \mid \text{stage } k \text{ ok}) = 1 - [1 - (1 - 0.5p_a - p_d)^{k-1}]^2
\]

If the stage k switch has failed in the address mode and
the stage 0 switch is functioning, the connection can be
made if the intermediate path from stage 0 to the
appropriate stage k input is error free. Thus

\[
P'_{S/E+}(C \mid \text{stage } k \text{ address fail}) = (1 - 0.5p_a - p_d)^{k-1}
\]

If the stage 0 switch has failed in the address mode then
the connection can be made if the remainder of the path
from the appropriate output from stage 0 is error free.
Thus

\[
P_{S/E+}(C \mid \text{stage 0 address fail}) = (1 - 0.5p_a - p_d)^{k}
\]

The above equations can be combined to give the probability
that a random pair of terminals can be connected using the
S/E+ ICN.
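
The connectivity equations combine into two short functions. This is a sketch under the assumption that p_f = p_a + p_d (the total switch failure probability, which is not restated in this section); the function names are illustrative.

```python
def p_connect_se(p_a, p_d, k):
    """Probability that a random terminal pair can be connected in the S/E ICN."""
    return (1 - 0.5 * p_a - p_d) ** k


def p_connect_se_plus(p_a, p_d, k):
    """Probability that a random terminal pair can be connected in the S/E+ ICN."""
    q = 1 - 0.5 * p_a - p_d  # per-switch probability that a path is usable
    p_f = p_a + p_d          # assumption: total failure probability

    stage_k_ok = 1 - (1 - q ** (k - 1)) ** 2  # either intermediate path works
    stage_k_addr_fail = q ** (k - 1)          # only one intermediate path helps
    stage_0_ok = stage_k_ok * (1 - p_f) + stage_k_addr_fail * p_a
    stage_0_addr_fail = q ** k                # forced onto a single path

    return stage_0_ok * (1 - p_f) + stage_0_addr_fail * p_a


k = 10  # a 1024x1024 network
print(p_connect_se(0.0, 0.05, k), p_connect_se_plus(0.0, 0.05, k))
```

For the data-mode-only example shown, the redundant path gives the S/E+ network the higher connection probability, which is the trade against bandwidth discussed in the text.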

Switch  Model 

The basic switch model is shown in figure 23. It
consists of a data switching circuit and a separate address
control and contention logic section. In addition, the ICN
designer also has available 1 bit wide, 2 of 3 majority
voters packaged either as a single voter SSI integrated circuit
or as a 25 voter integrated circuit contained in a 128
pin JEDEC package, as well as a 50 bit wide single error
correcting double error detecting unit (ECC). Thus the
fault tolerance strategies available are triple modular
redundancy (TMR), single error correction (ECC) or any
combination of the above.


Figure  23  —  Switch  Model 

The primary failure mode for all of the components used
in the switch design is assumed to be single bit stuck at
failures. Further assume that bits within a given circuit
are equally likely to fail and have independent Poisson
failure distributions. These assumptions were designed to
simplify the switch model. It should be realized that the
purpose of the switch model is to derive $p_a$ and $p_d$ so as to
demonstrate the bandwidth model. The primary purpose of
this example is to demonstrate the utility of the bandwidth
model in evaluating various reliability enhancement
options. An unduly complicated switch model, while perhaps
more realistic, complicates and obscures that
demonstration.

Table 1 shows the estimated and relative failure rates
for the components available to the system designer. The
estimated failure rates are based on the gate complexity of
the integrated circuits and are taken from Siewiorek[]
table D-6 which is based on the military handbook 217B
reliability model for integrated circuits. The last column
of table 1 is the failure rate relative to the single bit
failure rate. All data presented for comparison later in
this chapter will be presented as a function of time normalized
by the bit failure rate. The equations, which will
be used to determine $p_a$ and $p_d$ given the assumptions and
failure rates for the switch model, will now be developed.

Table 1 - Failure Rates

FUNCTION         #GATES   λ          λ REL   λ REL / BIT
Data switch      300      1.1668     50      1.0
Address          100      0.4839     20      na
Voter (single)   4        0.1207     5       5
Voter (25)       200      0.4935     20      0.8
ECC              200      0.751283   30      na


The data path for the basic system will be 45 bits
wide and consists of 32 data/memory control bits, 10 ICN
address bits, 1 ICN control line and 2 parity lines. When
ECC is used the parity bits are replaced with 7 ECC bits
resulting in a 50 bit wide data path through the network.
The actual data path could decrease by one bit for each
stage. This would, however, require different data
switches for each stage and would also require a new ECC
check bit calculation and insertion at each stage. As a
result, assume that the data path width remains constant
except at stage zero of the S/E+. There, the address
control bit for stage 0 is removed and not passed to the
remainder of the network. This is consistent with memory unit
address error checking as the stage 0 control bit selects
the ICN path but does not affect the memory module
addressed.

The ICN models require that the events represented by
$p_a$ and $p_d$ be disjoint. If both data mode and address mode
failures have occurred in the same switch it must be considered
as a data mode failure. Let PD represent the
probability that a data mode failure has occurred in a
switch and let PA represent the probability that an address
mode failure has occurred in the switch. The events
represented by PA and PD are independent in the switch
model. Thus

\[
p_d = PD
\]

\[
p_a = PA\,(1 - PD)
\]

These equations are valid for this switch model regardless
of the enhancement features that may be used. Now PA and
PD must be derived for the switch model.
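
The conversion from the independent switch-level probabilities to the disjoint model probabilities is a one-liner, sketched here with an illustrative function name. The check at the end relies only on the independence of PA and PD stated in the text.

```python
def disjoint_failure_probs(PA, PD):
    """Map independent PA and PD to the disjoint model probabilities (p_a, p_d).

    A switch with both failure types is counted as a data mode failure.
    """
    p_d = PD
    p_a = PA * (1 - PD)
    return p_a, p_d


p_a, p_d = disjoint_failure_probs(PA=0.1, PD=0.2)
# p_a + p_d should equal the probability that at least one failure occurred.
any_failure = 1 - (1 - 0.1) * (1 - 0.2)
print(p_a, p_d, any_failure)
```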


Comparison  of  Reliability  Options 

Figure 24 shows the expected bandwidth as a function
of λt for a 1024x1024 S/E ICN with several reliability
enhancement measures applied to the ICN. The configurations
presented are the base or unenhanced ICN, the ICN
with TMR applied to both the address logic and the data
path outputs, the ICN with ECC applied to the data path and
no address redundancy, and the ICN with TMR applied to the
address logic and ECC applied to the data path. Figure 25
presents the same data for an S/E+ network. In addition
the configuration designated as mixed consists of the S/E+
ICN with ECC applied to the data path and TMR applied to
the address logic of the first and last stages only. This
is a reasonable configuration for the S/E+ ICN as it is
in these stages that the two paths from any input to any
output pass through the same switch. Figures 26 and 27
show the probability that a random connection can be made
for each of the above configurations. In figures 28 and 29
the probability of connection and the expected bandwidth
for the S/E and S/E+ are compared for two configurations.

For values of λt less than 0.006, the maximum
expected bandwidth is obtained by applying TMR to the
address logic of the ICN and ECC to the data path. For
values greater than this the maximum is obtained by
applying TMR to both the address logic and to the data
path. For values of λt greater than approximately 0.003
the probability of connection is so low that the ICN is,
for most practical applications, no longer functional.
Thus, in this case, the selection of TMR applied to the
address logic and ECC to the data path is clearly superior
in terms of ICN bandwidth.

As was noted in chapter 3, the expected bandwidth for
the S/E+ network is always lower than that for a similarly
configured S/E network. However, the probability that a
randomly chosen connection can be made is higher for the
S/E+. This is especially significant for low values of λt,
as is clearly demonstrated in figures 26 and 27.

Table 2 lists the relative costs for each of the above
configurations. This cost is based on the total number of
gates required, normalized by the number of gates in an
unenhanced S/E ICN. Once the maximum unrepaired operating
time or desired mission time, the minimum acceptable
bandwidth and the minimum acceptable probability of random
connection are specified, the system designer can use the
cost model and figures 24 through 27 to determine the
minimum cost ICN meeting these constraints.
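
The selection procedure just described can be sketched in C. This is only an illustration under stated assumptions: the structure icn_config and the function select_min_cost are hypothetical names, the cost field is meant to hold the relative gate count from Table 2, and the bandwidth and connection probability fields are values the designer would read from the figures at the chosen mission time. Nothing here computes those values.

```c
#include <assert.h>
#include <stddef.h>

/* One candidate ICN configuration (hypothetical structure).
   cost      : relative gate count, as listed in Table 2
   bandwidth : expected bandwidth at the mission time (read from the figures)
   p_connect : probability of a random connection at the mission time */
struct icn_config {
    const char *name;
    double cost;
    double bandwidth;
    double p_connect;
};

/* Return the index of the cheapest configuration that meets both
   minimums, or -1 if no configuration qualifies. */
int select_min_cost(const struct icn_config *cfg, size_t n,
                    double min_bandwidth, double min_p_connect)
{
    int best = -1;
    size_t i;

    assert(cfg != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (cfg[i].bandwidth < min_bandwidth) continue;   /* too slow */
        if (cfg[i].p_connect < min_p_connect) continue;   /* too unreliable */
        if (best < 0 || cfg[i].cost < cfg[best].cost) best = (int)i;
    }
    return best;
}
```

A caller would fill one entry per row of Table 2 and pass the two minimums; a return value of -1 indicates that the constraints cannot be met by any of the listed configurations.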

The mixed configuration for the S/E+ ICN serves to
illustrate another feature of the bandwidth model which was
not previously mentioned. The equations developed in chapter 3
assumed that pa and pd were constant for all switches
in the network. There was, however, nothing in the
development which required that this be true. The only
requirement is that pa and pd be constant for each column in the
network. As a result, only minor modifications are
required to extend the bandwidth model to such situations.
This was done to compute the expected bandwidth with TMR
applied only to the first and last stage address circuits
rather than to all address circuits in the ICN.

The  relative  bandwidths  shown  in  figure  24  only  apply 
for  the  pa  and  pd  derived  above  and  these  results  should 
not  be  considered  applicable  to  all  S/E  and  S/E+  ICNs. 
Rather  the  figure  demonstrates  the  use  of  the  model  to 
determine  the  relative  merits  of  various  configurations 
given  that  pa  and  pd  as  a  function  of  time  are  known. 

Table 2 - Relative Cost Based on Gate Count

ICN     ENHANCEMENTS    RELATIVE COST

S/E     NONE            1.00
        TMR             3.55
        ECC             1.62
        TMR/ECC         2.17
S/E+    NONE            1.10
        TMR             3.91
        ECC             1.78
        TMR/ECC         2.39
        MIXED           1.89

[Figure 25 - Expected Bandwidth vs Time, S/E+ ICN. Horizontal axis: Normalized Time (0.000 to 0.008). Curves: Unenhanced, TMR, ECC, TMR/ECC, Mixed.]

[Figure 27 - Probability of Connection, S/E+ ICN. Horizontal axis: Normalized Time. Curves: Unenhanced, TMR, ECC, TMR/ECC, Mixed.]

[Figure 28 - Bandwidth Comparison, S/E vs S/E+. Horizontal axis: Normalized Time. Curves: S/E base, S/E+ base, S/E TMR-ECC, S/E+ TMR-ECC.]

[Figure 29 - Connection Probability Comparison, S/E vs S/E+. Horizontal axis: Normalized Time. Curves: S/E base, S/E+ base, S/E TMR-ECC, S/E+ TMR-ECC.]


VI.  CONCLUSIONS 


This dissertation documents a bandwidth
analysis of shuffle-exchange (S/E) and augmented
shuffle-exchange (S/E+) interconnection networks composed of binary
crossbar switches. These networks are intended for use as
interconnection and communication networks (ICNs) in large
multiprocessor computer systems and are topologically
equivalent to a much larger class of interconnection
networks. This chapter summarizes the results obtained and
suggests some areas for further research.

Summary of Results

The  major  contributions  of  this  work  are  summarized 
as  follows: 

1. An analysis technique which allows the prediction
of ICN bandwidth, in the presence of certain
types of failures, has been developed for
the S/E and S/E+ ICN.

2. It has been shown that the relatively simple
analysis done for the minimal networks composed
of log2N stages (S/E) can be extended to networks
augmented with an extra stage (S/E+)
provided that the extra stage is appropriately
considered.

3. The fault models used previously by other
researchers have been extended by incorporating
both address mode faults and data mode faults
into a single model.

4. An example which demonstrates the use of the
model to select among various reliability
enhancement measures in the design of multiprocessor
ICNs was presented.

The binary crossbar shuffle-exchange ICN is composed
of log2N stages, each consisting of N/2 binary crossbar
switches, where N is the number of input and output
terminals of the network. The stages are interconnected using a
perfect shuffle connection pattern. Such networks have
been shown to be topologically equivalent to a much larger
class of networks which use different stage to stage
connection patterns [22,27,28]. Thus results obtained here are
applicable to many other proposed interconnection networks.
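
As an illustration of the perfect shuffle connection pattern, the following C sketch computes the line that a given output line is connected to in the next stage. It relies on the standard definition of the perfect shuffle as a one-bit left rotation of the log2N-bit line address; the function names log2_terminals and perfect_shuffle are hypothetical and are not taken from the simulation program.

```c
#include <assert.h>

/* Number of address bits for an N-terminal network.
   N must be a power of two and at least 2. */
unsigned int log2_terminals(unsigned int n)
{
    unsigned int bits = 0;

    assert(n >= 2 && (n & (n - 1)) == 0);
    while ((1u << bits) < n) bits++;
    return bits;
}

/* Perfect shuffle: rotate the log2(N)-bit line address left by one bit. */
unsigned int perfect_shuffle(unsigned int line, unsigned int n)
{
    unsigned int bits = log2_terminals(n);

    assert(line < n);
    return ((line << 1) | (line >> (bits - 1))) & (n - 1);
}
```

For N = 8 this maps lines 0 through 7 to 0, 2, 4, 6, 1, 3, 5, 7, which is the same stage to stage mapping held in the outmapl table of the simulation listing.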

The augmented shuffle-exchange ICN (S/E+) consists of
log2N+1 stages. These stages are again interconnected with
the perfect shuffle connection pattern. The S/E+ provides
a redundant connection path from any input to any output as
compared to the S/E network and thus is desirable as a
reliability enhancement to the S/E network.
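
The hardware cost of the extra stage follows directly from the stage counts given above. The short C sketch below counts binary switches for either network; the function name switch_count and its extra_stages parameter are hypothetical, introduced only for this illustration.

```c
#include <assert.h>

/* Total number of binary crossbar switches in an N-terminal network:
   (log2(N) + extra_stages) stages of N/2 switches each.
   extra_stages is 0 for the S/E network and 1 for the S/E+ network. */
unsigned long switch_count(unsigned long n, unsigned int extra_stages)
{
    unsigned int stages = 0;

    assert(n >= 2 && (n & (n - 1)) == 0);   /* N must be a power of two */
    while ((1ul << stages) < n) stages++;
    return (unsigned long)(stages + extra_stages) * (n / 2);
}
```

For N = 1024 this gives 5120 switches for the S/E network and 5632 for the S/E+ network, a ratio of 1.10, which matches the relative cost listed for the unenhanced S/E+ ICN in Table 2.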
Bandwidth models [7,23,24] have been previously
developed for these networks in either the fault free state
or in the presence of data path faults (those in which
component switches pass no useful data). Additionally, the
effect of address mode faults, in which the data is passed
unmodified but is incorrectly routed, on the connection
capability of the network has been studied [27,28]. Prior
to this work, however, these two fault models had not been
combined in a single analysis, and no bandwidth analysis
had been made using the address mode fault model. The
bandwidth model developed in this work is computationally
simple and allows the estimation of the bandwidth of these
networks given that the probabilities of address and data
mode failures for the network's component switches are
known. The availability of such a model allows the ICN
designer to estimate the effectiveness of switch level
reliability measures on network and therefore system
performance. This then allows assessment of the cost benefit
ratio for various enhancement schemes.
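
The two fault modes can be illustrated for a single binary switch with the C sketch below. The numeric encoding of the fault states (0 = no fault, 1 = stuck straight through, 2 = stuck exchanged, 3 = data failure) follows the comments in the simulation listing; the enumeration names and the function switch_output are hypothetical and serve only to show how an address mode fault misroutes data while a data mode fault blocks it.

```c
#include <assert.h>

/* Fault state of one binary switch. */
enum fault {
    FAULT_NONE    = 0,  /* switch works as requested */
    FAULT_STUCK_T = 1,  /* address fault: stuck straight through */
    FAULT_STUCK_X = 2,  /* address fault: stuck exchanged */
    FAULT_DATA    = 3   /* data fault: no useful data is passed */
};

/* Setting requested by the routing logic. */
enum setting { THROUGH = 1, EXCHANGE = 2 };

/* Return the output port (0 = upper, 1 = lower) reached by data that
   enters on in_port, or -1 if the switch passes no useful data. */
int switch_output(enum fault f, enum setting requested, int in_port)
{
    enum setting actual;

    assert(in_port == 0 || in_port == 1);
    if (f == FAULT_DATA) return -1;
    if (f == FAULT_STUCK_T) actual = THROUGH;
    else if (f == FAULT_STUCK_X) actual = EXCHANGE;
    else actual = requested;
    return (actual == THROUGH) ? in_port : 1 - in_port;
}
```

With an address mode fault the data still leaves the switch, but on the wrong port whenever the requested setting differs from the stuck setting; this is the case the combined model must account for in addition to outright data loss.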

Suggested  Further  Research 

The bandwidth model developed in this work is limited
in that it only applies to a random distribution of output
port requests made at the ICN input ports. Previous
research [22,23,37,38] has shown that this assumption has
little effect on ICN bandwidth in the fault free state.
However, the control strategy for a fault tolerant
computing system would necessarily restrict the accesses of
any given input port to those to which connection was
possible. The development of a bandwidth model in which
non-random input distributions are allowed would improve
the ability to estimate system bandwidth in the presence of
such reconfiguration.

Several researchers have proposed ICNs composed of
crossbar switches of radix greater than two as well as
mixed radix switches [5,21]. The development of bandwidth
models for such systems would be a natural extension of
this work. It would, however, involve substantial work, as
the number of address mode faults possible at each switch
is significantly larger and the analysis thus more
complicated.
#include <stdio.h>
#include <stdlib.h>

unsigned int
 input[8]            /* the input requests for each processor */
,tags[8]             /* source id, redundancy tag, dest id for each processor */
,failstate[4][4]     /* for each switch 0=ok, 1=sat T, 2=sat X, 3=data fail */
,switchstate[4][4]   /* for each switch 0=not in use, 1=set at T, 2=set at X */
,conflicts[4][4]     /* for each switch 1=inputs conflict, 0=no conflict */
,redtags[8][8]       /* redundancy tags for each processor memory pair */
,totalconf[4]        /* the number of conflicts in each stage */
,*lines[5][8]        /* pointers to the controlling tag for each stage, line */
,outmapl[8]          /* maps outline at one stage to inline of next */
    ={0,2,4,6,1,3,5,7}
,outmapk[8]          /* map for last stage so it can be treated the */
    ={0,1,2,3,4,5,6,7}  /* same as others */
,numfails[4]         /* used to contain the total number of failures in */
                     /* a state ie numfails[1]=number sat T etc */
,numfail
,firstcall
,initcall=1          /* when this is 1=true the value in currentstate is */
;                    /* the first state evaled, all others are normal */
                     /* note well the strong dependence on numerr. the */
                     /* number of errors must match the initial value */
                     /* in currentstate */

char outfn[80]="se.dat",
     ifn[80]="se.dat";

unsigned long seed=1234567891, failstat, currentstate;

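/* forward declarations for the routines defined below */
double sim(unsigned int stage);
unsigned long getstate(int *done);
int checkblock(unsigned long failstat);
void initstate(unsigned numerr);
void getinp(void);
void setfailstate1(unsigned long failstat);
void settags(void);
void initlines(void);
void nulllines(void);
void writeout(unsigned long failst, double val);
void setredtags(void);
void setlines(int stage);
void switchX(int stage, int swtch, unsigned int *outmap);
void switchT(int stage, int swtch, unsigned int *outmap);
unsigned int setswitches(unsigned int stage);
void setconflict(unsigned int stage, unsigned int confstate);

int main(void)
{
    unsigned int i,j;
    int done;
    double expc;
    unsigned long temp;
    FILE *ifp;

    srand(seed);
    /* resume from the last state recorded in the output file, if any */
    if ((ifp=fopen(ifn,"r"))==NULL) {
        printf("error opening file %20s\n",ifn);
        currentstate=0l;
    }
    else {
        while (fscanf(ifp,"%lx %*d %*d %*d %*lf",&currentstate)==1);
        fclose(ifp);
    }
    /* count the failed switches in the resumed state */
    numfail=0;
    temp=currentstate;
    for (i=1; i<=16; i++,temp>>=1) if (temp&1) numfail++;
    printf("currentstate=%8lx numfail=%2u\n",currentstate,numfail);
    if (currentstate==(1ul<<numfail)-1) {
        numfail++;          /* last state for this count was done */
        initcall=0;
    }
    else {
        firstcall=0;
        getstate(&done);    /* advance past the state already done */
        initcall=1;
    }
    initlines();
    for (; numfail<=12; numfail++) {
        initstate(numfail);
        done=0;
        while (!done) {
            failstat=getstate(&done);
            if (checkblock(failstat));   /* a whole stage failed: skip */
            else {
                setfailstate1(failstat);
                setredtags();
                expc=0;
                for (j=1; j<=20000; j++) {
                    getinp();
                    settags();
                    expc+=sim(0);
                }
                expc=expc/(j-1);
                writeout(failstat,expc);
            }
        }
    }
    return 0;
}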


/* returns 1 if all four switches of any stage have failed */
int checkblock(unsigned long failstat)
{
    unsigned int i,temp;
    temp=0;
    for (i=0; i<=3; i++) {
        if ((failstat&0xf)==0xf) temp=1;
        failstat>>=4;
    }
    return(temp);
}
/* sets currentstate to the first 16 bit state with numerr failures */
/* (the numerr high order bits set) unless resuming from a file */
void initstate(unsigned numerr)
{
    unsigned i;
    if (initcall) initcall=0; else {
        currentstate=0l;
        for (i=0; i<numerr; i++) currentstate=currentstate<<1|1;
        currentstate<<=16-i;
    }
    firstcall=1;
}


/* returns the next smaller 16 bit state with the same number of */
/* failed switches; *done is set when the last state is returned */
unsigned long getstate(int *done)
{
    unsigned i,num0,num1;
    unsigned long tempstate;
    tempstate=currentstate;
    if (firstcall) {
        firstcall=0;
        if (tempstate==0 || tempstate==0xffff) *done=1; else *done=0;
        return(tempstate);
    }
    num0=num1=0;
    /* strip the trailing ones, then the zeros above them */
    while (tempstate&1 && num1<16) {tempstate>>=1; num1++;}
    while (!(tempstate&1) && num0+num1<16) {tempstate>>=1; num0++;}
    if (num1+num0==16) {
        printf("\nERROR IN GETSTATE num1=%2u num0=%2u",num1,num0);
        *done=1;
        return(currentstate);
    }
    /* move the next one down a position and pack the ones below it */
    tempstate>>=1;
    tempstate<<=2;
    tempstate|=1;
    for (i=0; i<num1; i++) tempstate=tempstate<<1|1;
    tempstate<<=num0-1;
    currentstate=tempstate;
    if (currentstate==(1ul<<numfail)-1) *done=1; else *done=0;
    return(currentstate);
}


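/* draws a random destination (0-7) for each processor from the high */
/* order bits of rand(); assumes rand() returns at least 29 bits */
void getinp(void)
{
    unsigned int proc;
    for (proc=0; proc<=7; proc++) input[proc]=(rand()>>26)&7;
}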
/* one bit per switch: a set bit marks the switch as a data failure */
void setfailstate1(unsigned long failstat)
{
    unsigned int stage,swtch,i;
    for (i=0; i<4; i++) numfails[i]=0;
    for (stage=0; stage<4; stage++)
        for (swtch=0; swtch<4; failstat>>=1,swtch++)
            if (failstat&1) {
                failstate[stage][swtch]=3;
                numfails[3]++;
            }
            else failstate[stage][swtch]=0;
}

/* two bits per switch: 0=ok, 1=sat T, 2=sat X, 3=data fail */
void setfailstate2(unsigned long failstat)
{
    unsigned int stage,swtch,state;
    for (state=0; state<4; state++) numfails[state]=0;  /* reset the counts */
    for (stage=0; stage<4; stage++)
        for (swtch=0; swtch<4; failstat>>=2,swtch++) {
            state=failstat&3;
            failstate[stage][swtch]=state;
            numfails[state]++;
        }
}


/* builds each tag: source id (bits 6-4), redundancy tag (bit 3), */
/* destination id (bits 2-0) */
void settags(void)
{
    unsigned int proc,mem;
    for (proc=0; proc<=7; proc++) {
        mem=input[proc];
        tags[proc]=(proc<<4)+(redtags[proc][mem]<<3)+mem;
    }
}

/* connects the processors to the first stage through a shuffle */
void initlines(void)
{
    int line;
    for (line=0; line<=3; line++) {
        lines[0][2*line]=(&tags[line]);
        lines[0][2*line+1]=(&tags[line+4]);
    }
    nulllines();
}
/* marks every line after the first stage inputs as unused */
void nulllines(void)
{
    unsigned int stage,line;
    for (stage=1; stage<5; stage++)
        for (line=0; line<=7; line++) lines[stage][line]=NULL;
}

/* appends the failure state, failure counts and result to the output file */
void writeout(unsigned long failst, double val)
{
    FILE *fp;
    fp=fopen(outfn,"a");
    if (fp==NULL) return;
    fprintf(fp," %9lx %3u %3u %3u %13.10f\n",failst,numfails[1],numfails[2],
            numfails[3],val);
    fclose(fp);
}

/* for each processor memory pair, keeps the default redundancy tag s0 */
/* unless a failed switch in stages 1-3 blocks that path, in which */
/* case the other tag is selected */
void setredtags(void)
{
    unsigned int proc,mem,tag,stage,s0;
    for (proc=0; proc<=7; proc++) {
        s0=proc>>3&1;
        for (mem=0; mem<=7; mem++) {
            tag=(proc<<4)+(s0<<3)+mem;
            redtags[proc][mem]=s0;
            stage=1;
            while ((stage<=3) && (redtags[proc][mem]==s0)) {
                switch (failstate[stage][(tag>>4-stage)&3]) {
                    case 0: break;
                    case 1: if ((tag>>3-stage&1)^(tag>>6-stage&1))
                                redtags[proc][mem]=~s0&1;
                            break;
                    case 2: if (!((tag>>3-stage&1)^(tag>>6-stage&1)))
                                redtags[proc][mem]=~s0&1;
                            break;
                    case 3: redtags[proc][mem]=~s0&1;
                }
                stage++;
            }
        }
    }
}
/* propagates the tag pointers from the inputs of a stage to the */
/* inputs of the next stage according to the switch settings */
void setlines(int stage)
{
    unsigned int swtch,*outmap;

    if (stage==3) outmap=outmapk; else outmap=outmapl;
    if (stage==0)
        for (swtch=0; swtch<=3; swtch++) {
            switch (failstate[0][swtch]) {
                case 0: switch (switchstate[0][swtch]) {
                            case 0: lines[1][outmap[swtch*2]]=NULL;
                                    lines[1][outmap[2*swtch+1]]=NULL;
                                    break;
                            case 1: switchT(0,swtch,outmap); break;
                            case 2: switchX(0,swtch,outmap);
                        }
                        break;
                case 1: lines[1][outmap[swtch*2]]=lines[0][swtch*2];
                        lines[1][outmap[2*swtch+1]]=lines[0][swtch*2+1];
                        break;
                case 2: lines[1][outmap[swtch*2]]=lines[0][swtch*2+1];
                        lines[1][outmap[2*swtch+1]]=lines[0][swtch*2];
                        break;
                case 3: lines[1][outmap[swtch*2]]=NULL;
                        lines[1][outmap[2*swtch+1]]=NULL;
            }
        }
    else
        for (swtch=0; swtch<=3; swtch++) {
            switch (failstate[stage][swtch]) {
                case 0: switch (switchstate[stage][swtch]) {
                            case 0: lines[stage+1][outmap[swtch*2]]=NULL;
                                    lines[stage+1][outmap[2*swtch+1]]=NULL;
                                    break;
                            case 1: switchT(stage,swtch,outmap); break;
                            case 2: switchX(stage,swtch,outmap);
                        }
                        break;
                case 1: if (switchstate[stage][swtch]==1) switchT(stage,swtch,outmap);
                        else { lines[stage+1][outmap[swtch*2]]=NULL;
                               lines[stage+1][outmap[2*swtch+1]]=NULL;
                        }
                        break;
                case 2: if (switchstate[stage][swtch]==2) switchX(stage,swtch,outmap);
                        else { lines[stage+1][outmap[swtch*2]]=NULL;
                               lines[stage+1][outmap[2*swtch+1]]=NULL;
                        }
                        break;
                case 3: lines[stage+1][outmap[swtch*2]]=NULL;
                        lines[stage+1][outmap[2*swtch+1]]=NULL;
            }
        }
}
/* switch set at X: the upper input passes to the lower output if it */
/* requests 1, the lower input to the upper output if it requests 0 */
void switchX(int stage, int swtch, unsigned int *outmap)
{
    unsigned int *upper,*lower;
    upper=lines[stage][2*swtch];
    lower=lines[stage][2*swtch+1];  /* try upper+1 */
    if (upper==NULL) lines[stage+1][outmap[2*swtch+1]]=NULL;
    else
        if ((*upper>>3-stage&1)==1) lines[stage+1][outmap[2*swtch+1]]=upper;
        else lines[stage+1][outmap[2*swtch+1]]=NULL;
    if (lower==NULL) lines[stage+1][outmap[2*swtch]]=NULL;
    else
        if ((*lower>>3-stage&1)==0) lines[stage+1][outmap[2*swtch]]=lower;
        else lines[stage+1][outmap[2*swtch]]=NULL;
}


/* switch set at T: the upper input passes to the upper output if it */
/* requests 0, the lower input to the lower output if it requests 1 */
void switchT(int stage, int swtch, unsigned int *outmap)
{
    unsigned int *upper,*lower;
    upper=lines[stage][2*swtch];
    lower=lines[stage][2*swtch+1];  /* try upper+1 */
    if (upper==NULL) lines[stage+1][outmap[2*swtch]]=NULL;
    else
        if ((*upper>>3-stage&1)==0) lines[stage+1][outmap[2*swtch]]=upper;
        else lines[stage+1][outmap[2*swtch]]=NULL;
    if (lower==NULL) lines[stage+1][outmap[2*swtch+1]]=NULL;
    else
        if ((*lower>>3-stage&1)==1) lines[stage+1][outmap[2*swtch+1]]=lower;
        else lines[stage+1][outmap[2*swtch+1]]=NULL;
}

/* sets every working switch of a stage from the requests on its two */
/* inputs (3=no request) and returns the number of conflicts found */
unsigned int setswitches(unsigned int stage)
{
    unsigned int count,swtch,uppreq,lowreq;
    count=0;
    for (swtch=0; swtch<=3; swtch++) {
        conflicts[stage][swtch]=0;
        uppreq=lowreq=3;
        if (lines[stage][swtch*2]) uppreq=(*lines[stage][swtch*2]>>3-stage&1);
        if (lines[stage][swtch*2+1])
            lowreq=(*lines[stage][swtch*2+1]>>3-stage&1);
        if (failstate[stage][swtch]==0) {
            if (uppreq==3 && lowreq==3) switchstate[stage][swtch]=0;
            else if (uppreq==3) switchstate[stage][swtch]=(lowreq)?1:2;
            else if (lowreq==3) switchstate[stage][swtch]=(uppreq)?2:1;
            else if (lowreq==1 && uppreq==0) switchstate[stage][swtch]=1;
            else if (lowreq==0 && uppreq==1) switchstate[stage][swtch]=2;
            else { switchstate[stage][swtch]=1;   /* both want the same output */
                   conflicts[stage][swtch]=1;
                   count++; }
        }
        else switchstate[stage][swtch]=failstate[stage][swtch];
    }
    return(count);
}


/* clears all switch settings and conflict flags */
void initswitchstate(void)
{
    int stage,swtch;
    for (stage=0; stage<=3; stage++) for (swtch=0; swtch<=3; swtch++) {
        switchstate[stage][swtch]=0;
        conflicts[stage][swtch]=0;
    }
}

/* returns the expected number of connections completed, averaged over */
/* every way of resolving the conflicts from this stage onward */
double sim(unsigned int stage)
{
    unsigned int line,conflictstate,lastconfstate;
    double count;
    if (stage==3) {
        setswitches(3);
        setlines(3);
        count=0;
        for (line=0; line<=7; line++) if (lines[4][line]!=NULL) count+=1.0;
/*dd
        printinp(5,10);
        printall(5,18);
        getchar();
*/
        return(count/(1<<(totalconf[0]+totalconf[1]+totalconf[2])));
    }
    else {
        totalconf[stage]=setswitches(stage);
        setlines(stage);
        count=sim(stage+1);
        lastconfstate=1<<totalconf[stage];
        for (conflictstate=1; conflictstate<lastconfstate; conflictstate++) {
            setconflict(stage,conflictstate);
            setlines(stage);
            count+=sim(stage+1);
        }
        return(count);
    }
}
/* resolves the conflicts of a stage: bit k of confstate selects X (1) */
/* or T (0) for the k-th conflicting switch */
void setconflict(unsigned int stage, unsigned int confstate)
{
    unsigned int swtch;
    for (swtch=0; swtch<=3; swtch++)
        if (conflicts[stage][swtch]) {
            if (confstate&1) switchstate[stage][swtch]=2;
            else switchstate[stage][swtch]=1;
            confstate>>=1;
        }
}


